CR HC Nov 30, 2017

VoiceMask: Anonymize and Sanitize Voice Input on Mobile Devices

Jianwei Qian, Haohua Du, Jiahui Hou, Linlin Chen, Taeho Jung, Xiang-Yang Li, Yu Wang, Yanbo Deng

arXiv:1711.11460v1 16.246 citations

Originality: Incremental advance

AI Analysis

VoiceMask addresses privacy concerns for mobile users by protecting their identity and sensitive input content from cloud profiling, though it is an incremental improvement on existing sanitization methods.

The paper tackles privacy risks in cloud-based speech recognition by introducing VoiceMask, a mobile app that anonymizes voice data and sanitizes its content. It reduces the chance of a user's voice being identified among 50 people by 84% while keeping the drop in speech recognition accuracy within 14.2%.

Voice input has greatly improved the user experience of mobile devices by freeing our hands from typing on a small screen. Speech recognition is the key technology that powers voice input, and it is usually outsourced to the cloud for the best performance. However, the cloud might compromise users' privacy by identifying them by their voice, learning their sensitive input content via speech recognition, and then profiling them based on that content. In this paper, we design an intermediary between users and the cloud, named VoiceMask, to sanitize users' voice data before sending it to the cloud for speech recognition. We analyze the potential privacy risks and aim to protect users' identities and sensitive input content from being disclosed to the cloud. VoiceMask adopts a carefully designed voice conversion mechanism that is resistant to several attacks. Meanwhile, it uses an evolution-based keyword substitution technique to sanitize the voice input content. Both sanitization phases are performed on the resource-limited mobile device while still maintaining the usability and accuracy of the cloud-supported speech recognition service. We implement the voice sanitizer on Android and present extensive experimental results that validate the effectiveness and efficiency of our app. We demonstrate that we can reduce the chance of a user's voice being identified among 50 people by 84% while keeping the drop in speech recognition accuracy within 14.2%.

