Recognizing Driver Talking Direction in Running Vehicles with a Smartphone

Submitted by aekwall on Mon, 08/03/2020 - 10:16am

Title	Recognizing Driver Talking Direction in Running Vehicles with a Smartphone
Publication Type	Conference Paper
Year of Publication	2019
Authors	Dai, Haipeng, Liu, Alex X., Li, Zeshui, Wang, Wei, Zhang, Fengmin, Dong, Chao
Conference Name	2019 IEEE 16th International Conference on Mobile Ad Hoc and Sensor Systems (MASS)
Keywords	Acoustic Fingerprints, Acoustic signal processing, channel fingerprint extraction, classification accuracy, composability, direction-of-arrival estimation, driver distraction, driver information systems, Driver Talking Direction, driver talking direction recognition, fingerprint, fingerprint based sound source localization approaches, Human Behavior, human factors, in-vehicle environment, microphones, multipath effects, phone placements, pubcrawl, recognition, Resiliency, road safety, safety enhancement, signal classification, smart phones, smartphone, Source separation, speech processing, speech signal, talking status identification, time-of-arrival estimation
Abstract	This paper addresses the fundamental problem of identifying driver talking directions using a single smartphone, which can enhance safety by warning drivers when conversations with passengers distract them. The basic idea of our system is to identify talking status and talking direction using two microphones on a smartphone. We first use the sound recorded by the two microphones to determine whether the driver is talking. If so, we extract the so-called channel fingerprint from the speech signal and classify it into one of three typical driver talking directions (front, right, and back) using a model trained in advance. The key novelty of our scheme is the channel fingerprint, which leverages the heavy multipath effects of the harsh in-vehicle environment and cancels the variability of the human voice; these two factors together invalidate traditional TDoA, DoA, and fingerprint-based sound source localization approaches. We conducted extensive experiments using two kinds of phones and two vehicles, with four phone placements in three representative scenarios, and collected 23 hours of voice data from 20 participants. The results show that our system achieves 95.0% classification accuracy on average.
DOI	10.1109/MASS.2019.00011
Citation Key	dai_recognizing_2019

Groups:

Science of Security VO