SD LG AS

Dec 14, 2021

Real-Time Neural Voice Camouflage

Mia Chiquier, Chengzhi Mao, Carl Vondrick

arXiv:2112.07076v2

11.79 citations

Originality: Incremental advance

AI Analysis

The work addresses individual privacy by making voice camouflage effective in streaming settings. It is an incremental advance: it builds on existing adversarial attacks and adds a predictive component.

The paper tackles real-time voice camouflage to prevent eavesdropping by automatic speech recognition systems. Under real-time constraints, its attack jams DeepSpeech 3.9x more than baselines as measured by word error rate, and 6.6x more as measured by character error rate.
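Word error rate and character error rate are both edit distances normalized by the length of the reference transcript, so a higher value means the recognizer was jammed more. The sketch below shows one common way to compute them, assuming whitespace tokenization and no text normalization. The function names are illustrative and are not taken from the paper's code.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences
    (substitutions, insertions, and deletions each cost 1)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # match or substitution
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """Word-level edits divided by the number of reference words."""
    ref_words = reference.split()
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def character_error_rate(reference, hypothesis):
    """Character-level edits divided by the number of reference characters."""
    return edit_distance(reference, hypothesis) / len(reference)


# Made-up transcripts for illustration only (not data from the paper).
reference = "turn on the lights"
hypothesis = "turn of the light"
wer = word_error_rate(reference, hypothesis)  # 2 word errors / 4 words = 0.5
cer = character_error_rate(reference, hypothesis)
```

A statement such as "3.9x more than baselines" would then correspond to comparing error rates of this kind between the proposed attack and a baseline attack on the same audio.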

Automatic speech recognition systems have created exciting possibilities for applications, however they also enable opportunities for systematic eavesdropping. We propose a method to camouflage a person's voice over-the-air from these systems without inconveniencing the conversation between people in the room. Standard adversarial attacks are not effective in real-time streaming situations because the characteristics of the signal will have changed by the time the attack is executed. We introduce predictive attacks, which achieve real-time performance by forecasting the attack that will be the most effective in the future. Under real-time constraints, our method jams the established speech recognition system DeepSpeech 3.9x more than baselines as measured through word error rate, and 6.6x more as measured through character error rate. We furthermore demonstrate our approach is practically effective in realistic environments over physical distances.
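The timing argument in the abstract can be illustrated with a toy streaming loop. A perturbation computed from the audio heard so far cannot be played until some delay has passed, so a predictive attack aims it at a chunk in the future rather than at the chunk that was just heard. The sketch below is a minimal illustration under stated assumptions: `toy_predictor` is a hypothetical placeholder that returns bounded noise, standing in for the learned forecasting model, and `chunk_len`, `delay_chunks`, and `eps` are illustrative parameters, not values from the paper.

```python
import random


def toy_predictor(past_audio, chunk_len, eps):
    """Hypothetical stand-in for a learned forecasting model.

    Returns a perturbation bounded by eps for a FUTURE chunk, given only
    past audio. A real predictor would be trained; this one is just
    deterministic noise seeded by how much audio has been heard.
    """
    rng = random.Random(len(past_audio))
    return [rng.uniform(-eps, eps) for _ in range(chunk_len)]


def stream_with_predictive_attack(audio, chunk_len, delay_chunks,
                                  eps=0.01, predictor=toy_predictor):
    """Simulate streaming: after hearing chunk k, emit a perturbation that
    is added to chunk k + delay_chunks, which has not been heard yet."""
    n_chunks = len(audio) // chunk_len
    out = list(audio[: n_chunks * chunk_len])
    schedule = []  # (last chunk heard, chunk the perturbation lands on)
    for k in range(n_chunks - delay_chunks):
        heard = audio[: (k + 1) * chunk_len]  # only past samples are visible
        target = k + delay_chunks
        delta = predictor(heard, chunk_len, eps)
        start = target * chunk_len
        for i, d in enumerate(delta):
            out[start + i] += d
        schedule.append((k, target))
    return out, schedule
```

In this simulation the first `delay_chunks` chunks are never perturbed, because no perturbation computed from earlier audio can reach them in time. That reflects the constraint the abstract describes for attacks that react only to the current signal.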
