SD, LG, AS | Dec 14, 2021

Real-Time Neural Voice Camouflage

arXiv:2112.07076v2 | 9 citations
AI Analysis

This work addresses individual privacy concerns by enabling effective voice camouflage in streaming settings. It is incremental, however: it builds on existing adversarial attacks and adds a predictive component.

The paper tackles real-time voice camouflage to prevent eavesdropping by automatic speech recognition systems. Under real-time constraints, the method jams DeepSpeech 3.9x more than baselines as measured by word error rate, and 6.6x more as measured by character error rate.
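For readers unfamiliar with the two metrics, the following is a minimal sketch of how word error rate and character error rate are conventionally computed: the edit distance between the reference transcript and the recognizer's output, divided by the reference length. The function names are illustrative, and this is not the paper's evaluation code. Real evaluations usually normalize the text (case, punctuation) first, which this sketch skips.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (lists of words or strings)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion
                curr[j - 1] + 1,     # insertion
                prev[j - 1] + cost,  # substitution or match
            ))
        prev = curr
    return prev[-1]


def word_error_rate(reference, hypothesis):
    """Word-level edit distance divided by the number of reference words."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def char_error_rate(reference, hypothesis):
    """Character-level edit distance divided by the reference length."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

A higher error rate on the recognizer's transcript means the camouflage is working better, so "3.9x more" describes how much more the attack raises the error than the baselines do.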

Automatic speech recognition systems have created exciting possibilities for applications; however, they also enable opportunities for systematic eavesdropping. We propose a method to camouflage a person's voice over-the-air from these systems without inconveniencing the conversation between people in the room. Standard adversarial attacks are not effective in real-time streaming situations because the characteristics of the signal will have changed by the time the attack is executed. We introduce predictive attacks, which achieve real-time performance by forecasting the attack that will be the most effective in the future. Under real-time constraints, our method jams the established speech recognition system DeepSpeech 3.9x more than baselines as measured through word error rate, and 6.6x more as measured through character error rate. We furthermore demonstrate our approach is practically effective in realistic environments over physical distances.
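The timing idea behind a predictive attack can be sketched as a scheduling loop: a perturbation computed from the audio heard up to chunk t is not played immediately, but queued for chunk t + delay, where the delay covers the time needed to compute and emit it. Everything below is a hypothetical illustration. The function name, the `predict` callable, and the chunking are assumptions, and the paper's forecasting model is not reproduced here.

```python
from collections import deque


def stream_with_predictive_attack(chunks, predict, delay, context):
    """Return (chunk_index, perturbation) pairs for a simulated stream.

    `predict` is a hypothetical stand-in for a forecasting model: it takes
    the most recent `context` chunks and returns the perturbation intended
    for the chunk that arrives `delay` steps later.
    """
    if delay < 1:
        raise ValueError("delay must be at least one chunk")
    history = deque(maxlen=context)  # sliding window of past audio
    scheduled = {}                   # future chunk index -> perturbation
    played = []
    for t, chunk in enumerate(chunks):
        # Play whatever was forecast for this moment `delay` steps ago;
        # None means no forecast was available yet (start of the stream).
        played.append((t, scheduled.pop(t, None)))
        history.append(chunk)
        # Forecast for the future using only audio already heard.
        scheduled[t + delay] = predict(list(history))
    return played


# Toy usage: the "model" simply echoes the latest chunk it has seen.
result = stream_with_predictive_attack(
    chunks=[10, 20, 30, 40],
    predict=lambda window: window[-1],
    delay=2,
    context=2,
)
print(result)  # [(0, None), (1, None), (2, 10), (3, 20)]
```

In the toy output, the perturbation played at chunk 2 was computed from chunk 0. This is the situation the abstract describes: the attack must be chosen for a signal that has not been observed yet.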

