Audio Adversarial Examples: Targeted Attacks on Speech-to-Text
The work opens a new domain for studying adversarial examples, with direct implications for the security of speech recognition systems.
The authors tackle the problem of constructing targeted adversarial attacks on speech-to-text systems. With a 100% success rate, they turn a given audio waveform into one that is over 99.9% similar to the original but transcribes as any chosen phrase.
We construct targeted audio adversarial examples on automatic speech recognition. Given any audio waveform, we can produce another that is over 99.9% similar, but transcribes as any phrase we choose (recognizing up to 50 characters per second of audio). We apply our white-box iterative optimization-based attack to Mozilla's implementation of DeepSpeech end-to-end, and show it has a 100% success rate. The feasibility of this attack introduces a new domain to study adversarial examples.
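The general shape of a white-box, iterative, optimization-based targeted attack can be sketched on a toy problem. The sketch below is not the authors' implementation and does not touch DeepSpeech: it assumes a hypothetical three-class linear "model" (each class logit is the sum of every third sample), a cross-entropy loss toward the target class, and signed gradient steps clipped to a maximum per-sample change `eps`. All names and constants (`logits_of`, `targeted_attack`, `eps`, `step`, the signal levels) are illustrative assumptions.

```python
import math


def softmax(logits):
    # Numerically stable softmax: subtract the maximum before exponentiating.
    m = max(logits)
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def logits_of(x, num_classes):
    # Hypothetical toy model: the logit of class k is the sum of the
    # samples x[i] whose index satisfies i % num_classes == k.
    out = [0.0] * num_classes
    for i, v in enumerate(x):
        out[i % num_classes] += v
    return out


def predict(x, num_classes):
    scores = logits_of(x, num_classes)
    return scores.index(max(scores))


def targeted_attack(x, target, num_classes, eps, step, iters):
    # White-box: the attacker reads the model's probabilities and computes
    # the exact gradient of the loss with respect to the input.
    delta = [0.0] * len(x)
    for _ in range(iters):
        adv = [a + d for a, d in zip(x, delta)]
        probs = softmax(logits_of(adv, num_classes))
        for i in range(len(x)):
            k = i % num_classes
            # Gradient of cross-entropy toward `target` w.r.t. x[i]:
            # p_k - 1 for the target class, p_k for every other class.
            grad = probs[k] - (1.0 if k == target else 0.0)
            sign = (grad > 0) - (grad < 0)
            # Signed descent step, then clip so |delta[i]| never exceeds eps.
            delta[i] = max(-eps, min(eps, delta[i] - step * sign))
    return delta


NUM_CLASSES = 3
levels = [0.300, 0.299, 0.298]  # toy signal: class 0 wins by a small margin
x = [levels[i % NUM_CLASSES] for i in range(3000)]

delta = targeted_attack(x, target=2, num_classes=NUM_CLASSES,
                        eps=0.002, step=0.0005, iters=10)
x_adv = [a + d for a, d in zip(x, delta)]

original_label = predict(x, NUM_CLASSES)
adversarial_label = predict(x_adv, NUM_CLASSES)
max_change = max(abs(d) for d in delta)
print(original_label, adversarial_label, max_change)
```

With these toy numbers the predicted label moves from class 0 to the chosen target class 2, while no sample changes by more than `eps`. A real speech-to-text attack would replace the linear model and cross-entropy with the recognizer's own network and sequence loss, but the loop of "compute gradient, take a small step, keep the perturbation bounded" has the same structure.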