LG AS ML · Oct 22, 2019

Adversarial Example Detection by Classification for Deep Speech Recognition

Saeid Samizade, Zheng-Hua Tan, Chao Shen, Xiaohong Guan

arXiv:1910.10013v17.738 citations

Originality: Incremental advance

AI Analysis

This work addresses the vulnerability of speech recognition systems to adversarial attacks. The detection method is an incremental advance: it builds on existing classification approaches and targets specific attack types.

The authors detect adversarial examples in deep speech recognition systems by formulating defense as a classification task. They generate datasets for white-box and black-box attacks and train a CNN on cepstral features. Detection is accurate for known attacks but degrades significantly for unknown ones.

Machine learning systems are vulnerable to adversarial attacks and are highly likely to produce incorrect outputs under them. Attacks are classed as white-box or black-box according to the adversary's level of access to the victim learning algorithm. To defend learning systems against these attacks, existing methods in the speech domain focus on modifying input signals and testing the behaviour of speech recognizers. We instead formulate the defense as a classification problem and present a strategy for systematically generating adversarial example datasets: one for white-box attacks and one for black-box attacks, each containing both adversarial and normal examples. The white-box attack is a gradient-based method applied to Baidu DeepSpeech with the Mozilla Common Voice database, while the black-box attack is a gradient-free method applied to a deep model-based keyword spotting system with the Google Speech Commands dataset. The generated datasets are used to train a proposed Convolutional Neural Network (CNN) on cepstral features to detect adversarial examples. Experimental results show that it is possible to accurately distinguish between adversarial and normal examples for known attacks, in both single-condition and multi-condition training settings, while performance degrades dramatically for unknown attacks. The adversarial datasets and the source code are made publicly available.
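
The abstract says only that the detector is trained on "cepstral features" and does not specify which kind. As a rough illustration of that front end, the following sketch computes mel-frequency cepstral coefficients (MFCCs), one common cepstral representation, with NumPy alone. The choice of MFCCs, the function names, and every parameter value (16 kHz audio, 25 ms frames, 10 ms hop, 26 mel filters, 13 coefficients) are assumptions made for illustration, not the authors' configuration.

```python
import numpy as np

def hz_to_mel(hz):
    return 2595.0 * np.log10(1.0 + hz / 700.0)

def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centers are equally spaced on the mel scale."""
    mel_points = np.linspace(hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_points) / sample_rate).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):       # rising edge of the triangle
            fbank[i - 1, k] = (k - left) / (center - left)
        for k in range(center, right):      # falling edge of the triangle
            fbank[i - 1, k] = (right - k) / (right - center)
    return fbank

def dct_ii(x, n_out):
    """Unnormalized DCT-II along the last axis, keeping the first n_out terms."""
    n = x.shape[-1]
    k = np.arange(n_out)[:, None]
    m = np.arange(n)[None, :]
    basis = np.cos(np.pi * k * (2 * m + 1) / (2 * n))
    return x @ basis.T

def mfcc(signal, sample_rate=16000, frame_len=400, hop=160,
         n_fft=512, n_filters=26, n_ceps=13):
    """Return an (n_frames, n_ceps) matrix of MFCCs (all defaults are assumed values)."""
    signal = np.asarray(signal, dtype=float)
    if len(signal) < frame_len:             # pad very short inputs to one full frame
        signal = np.pad(signal, (0, frame_len - len(signal)))
    n_frames = 1 + (len(signal) - frame_len) // hop
    idx = np.arange(frame_len)[None, :] + hop * np.arange(n_frames)[:, None]
    frames = signal[idx] * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, n=n_fft, axis=1)) ** 2 / n_fft
    mel_energy = power @ mel_filterbank(n_filters, n_fft, sample_rate).T
    log_mel = np.log(mel_energy + 1e-10)    # small floor avoids log(0)
    return dct_ii(log_mel, n_ceps)

# Usage: one second of synthetic noise stands in for a speech recording.
audio = np.random.default_rng(0).standard_normal(16000)
features = mfcc(audio)                      # one row per frame, one column per coefficient
```

In a detection setup like the one described, a matrix of this kind would be computed for each normal and adversarial recording and passed to the CNN as a two-dimensional input, with the label indicating whether the recording is adversarial.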
