Adversarial Attacks on ASR Systems: An Overview
The paper addresses security vulnerabilities in ASR systems used in applications such as smart speakers and self-driving cars; as a survey, its contribution is incremental.
This paper provides an overview of adversarial attacks on Automatic Speech Recognition (ASR) systems, describing the development of ASR, different attack assumptions, and evaluation methods, with a focus on where in the ASR system each attack perturbs the waveform and how the attacks are implemented.
With advances in hardware and algorithms, ASR (Automatic Speech Recognition) systems have evolved considerably. As models become simpler and easier to develop and deploy, ASR systems are becoming part of everyday life. On the one hand, we often use ASR apps or APIs to generate subtitles and record meetings. On the other hand, smart speakers and self-driving cars rely on ASR systems to control AIoT devices. In the past few years, there has been a great deal of work on adversarial example attacks against ASR systems. Adding a small perturbation to a waveform can change the recognition result substantially. In this paper, we describe the development of ASR systems, the different assumptions behind attacks, and how to evaluate these attacks. Next, we introduce current work on adversarial example attacks under two attack assumptions: white-box and black-box. Unlike other surveys, we pay more attention to the layer of the ASR system at which each attack perturbs the waveform, the relationships between these attacks, and their implementation methods. We focus on the effect of these works.
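To make the notion of a "small perturbation" concrete, the following is a minimal sketch, not a method taken from any surveyed work. It assumes a waveform normalized to [-1, 1], an L-infinity bound `epsilon` on the perturbation, and signal-to-noise ratio (SNR) as one possible way to quantify how small the perturbation is. The function names `add_bounded_perturbation` and `snr_db`, the synthetic 440 Hz tone, and the random `delta` are all illustrative assumptions; a real attack would optimize `delta` against an ASR model rather than sample it at random.

```python
import numpy as np


def add_bounded_perturbation(waveform, delta, epsilon):
    """Add delta to waveform after clipping delta to [-epsilon, epsilon].

    The result is also clipped to [-1, 1] so it remains a valid
    normalized waveform (an assumption of this sketch).
    """
    delta = np.clip(delta, -epsilon, epsilon)
    return np.clip(waveform + delta, -1.0, 1.0)


def snr_db(clean, adversarial):
    """Signal-to-noise ratio in dB, treating the perturbation as noise."""
    noise = adversarial - clean
    return 10.0 * np.log10(np.sum(clean ** 2) / np.sum(noise ** 2))


# Synthetic stand-in for one second of speech: a 440 Hz tone at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
clean = 0.5 * np.sin(2 * np.pi * 440.0 * t)

# Random placeholder perturbation; an actual attack would compute this
# by optimizing against the target ASR model.
rng = np.random.default_rng(0)
delta = rng.normal(scale=0.01, size=clean.shape)

epsilon = 0.005
adversarial = add_bounded_perturbation(clean, delta, epsilon)
print(f"SNR: {snr_db(clean, adversarial):.1f} dB")
```

A higher SNR means the perturbation is smaller relative to the original signal. Whether a given bound is imperceptible to listeners depends on the audio and the listening conditions, which this sketch does not model.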