Motion Sensor-based Privacy Attack on Smartphones
The attack reveals a fundamental design vulnerability in smartphones that puts users' speech privacy at risk during loudspeaker use, such as in calls or media playback.
The paper demonstrates speech privacy leakage on smartphones: accelerometer readings affected by loudspeaker reverberations allow gender classification with over 90% accuracy and speaker identification with over 80% accuracy.
In this paper, we build a speech privacy attack that exploits speech reverberations generated by a smartphone's built-in loudspeaker and captured via a zero-permission motion sensor (the accelerometer). We design our attack, Spearphone, and demonstrate that speech reverberations from built-in loudspeakers, at an appropriate loudness, can impact the accelerometer, leaking sensitive information about the speech. In particular, we show that by exploiting the affected accelerometer readings and carefully selecting feature sets along with off-the-shelf machine learning techniques, Spearphone can successfully perform gender classification (accuracy over 90%) and speaker identification (accuracy over 80%) for any audio/video playback on the smartphone. Our results from testing the attack on a voice call and a voice assistant response were also encouraging, showcasing the impact of the proposed attack. In addition, we perform speech recognition and speech reconstruction to extract, to some extent, more information about the eavesdropped speech. Our work brings to light a fundamental design vulnerability in many currently deployed smartphones, which may put people's speech privacy at risk while using the smartphone in loudspeaker mode during phone calls, media playback, or voice assistant interactions.
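To make the pipeline described above concrete (accelerometer readings, then a feature set, then an off-the-shelf classifier), the following is a minimal sketch on synthetic data. It is not the Spearphone implementation. The abstract does not list the features or the classifier, so the time-domain statistics, the nearest-centroid classifier, the 500 Hz sampling rate, the pitch ranges, and all function names here are illustrative assumptions. The accuracy it prints describes only this toy data and is unrelated to the results reported in the paper.

```python
import math
import random
import statistics


def extract_features(window):
    """Summarize one single-axis accelerometer window with time-domain statistics."""
    mean = statistics.fmean(window)
    centered = [x - mean for x in window]
    # Sign changes around the mean: a rough proxy for the dominant frequency.
    crossings = sum(1 for a, b in zip(centered, centered[1:]) if a * b < 0)
    return [
        mean,
        statistics.pstdev(window),
        max(window) - min(window),
        sum(x * x for x in window) / len(window),  # average energy
        crossings / (len(window) - 1),             # zero-crossing rate
    ]


def synthetic_window(pitch_hz, rng, rate_hz=500, n=250):
    """Fake trace (assumption): a sinusoid at the speaker's pitch plus Gaussian noise."""
    return [
        0.02 * math.sin(2 * math.pi * pitch_hz * i / rate_hz) + rng.gauss(0, 0.003)
        for i in range(n)
    ]


def make_dataset(rng, per_class):
    """Label windows using a hypothetical pitch range for each class."""
    pitch_ranges = {"male": (100, 140), "female": (190, 230)}
    data = []
    for label, (lo, hi) in pitch_ranges.items():
        for _ in range(per_class):
            window = synthetic_window(rng.uniform(lo, hi), rng)
            data.append((extract_features(window), label))
    return data


def fit_centroids(train):
    """Nearest-centroid 'training': average the feature vectors of each class."""
    centroids = {}
    for label in {lbl for _, lbl in train}:
        rows = [f for f, lbl in train if lbl == label]
        centroids[label] = [statistics.fmean(col) for col in zip(*rows)]
    return centroids


def predict(centroids, features):
    """Return the label whose centroid is closest in Euclidean distance."""
    return min(centroids, key=lambda lbl: math.dist(centroids[lbl], features))


rng = random.Random(0)  # fixed seed so the sketch is reproducible
train = make_dataset(rng, per_class=40)
test = make_dataset(rng, per_class=20)
centroids = fit_centroids(train)
accuracy = sum(predict(centroids, f) == lbl for f, lbl in test) / len(test)
print(f"toy accuracy on synthetic data: {accuracy:.2f}")
```

The features are left unscaled for brevity, which works here only because the zero-crossing rate dominates the distance between classes. A realistic attack would work from recorded sensor traces and would likely need feature scaling and a stronger classifier.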