SD CR CV LG AS

Oct 22, 2020

Class-Conditional Defense GAN Against End-to-End Speech Attacks

Mohammad Esmaeilpour, Patrick Cardinal, Alessandro Lameiras Koerich

arXiv:2010.11352v29.314 citations

Originality: Incremental advance

AI Analysis

This paper addresses security vulnerabilities in speech-to-text systems such as DeepSpeech and Lingvo, offering a novel but incremental improvement over existing defense approaches.

The paper tackles the problem of defending speech-to-text systems against end-to-end adversarial attacks by proposing a class-conditional defense GAN that reconstructs signals without adding extra noise, resulting in improved word error rate and sentence-level accuracy compared to conventional methods.

In this paper, we propose a novel defense approach against end-to-end adversarial attacks developed to fool advanced speech-to-text systems such as DeepSpeech and Lingvo. Unlike conventional defense approaches, the proposed approach does not directly employ low-level transformations, such as autoencoding a given input signal, to remove potential adversarial perturbation. Instead, we find an optimal input vector for a class-conditional generative adversarial network by minimizing the relative chordal distance adjustment between a given test input and the generator network. Then, we reconstruct the 1D signal from the synthesized spectrogram and the original phase information derived from the given input signal. Hence, this reconstruction does not add any extra noise to the signal. According to our experimental results, our defense-GAN considerably outperforms conventional defense algorithms in terms of both word error rate and sentence-level recognition accuracy.
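The final step of the abstract, rebuilding a 1D signal from a synthesized spectrogram combined with the phase of the original input, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes non-overlapping rectangular frames and a naive DFT in place of whatever STFT settings the paper uses, and `synthesized_mags` is a hypothetical stand-in for the generator's output (here simply the input's own magnitudes, so the round trip can be checked).

```python
import cmath
import math


def dft(frame):
    """Naive discrete Fourier transform of one frame."""
    n = len(frame)
    return [sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spectrum):
    """Naive inverse DFT; keeps only the real part of each sample."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def analyze(signal, frame_len):
    """Split into non-overlapping frames and return (magnitudes, phases).

    Assumption: rectangular window, no overlap. Trailing samples that do not
    fill a whole frame are dropped.
    """
    mags, phases = [], []
    for start in range(0, len(signal) - frame_len + 1, frame_len):
        spec = dft(signal[start:start + frame_len])
        mags.append([abs(c) for c in spec])
        phases.append([cmath.phase(c) for c in spec])
    return mags, phases


def reconstruct(mags, phases):
    """Combine (synthesized) magnitudes with the original phases, then invert."""
    signal = []
    for mag_row, phase_row in zip(mags, phases):
        spec = [cmath.rect(m, p) for m, p in zip(mag_row, phase_row)]
        signal.extend(idft(spec))
    return signal


# Toy input: two sinusoids, 32 samples, analyzed in frames of 8.
signal = [math.sin(2 * math.pi * 3 * t / 32) + 0.5 * math.cos(2 * math.pi * 5 * t / 32)
          for t in range(32)]
mags, phases = analyze(signal, frame_len=8)

# Hypothetical stand-in for the generator output: reuse the input magnitudes.
synthesized_mags = mags
restored = reconstruct(synthesized_mags, phases)

# Largest sample-wise deviation between the input and the reconstruction.
print(max(abs(a - b) for a, b in zip(signal, restored)))
```

In a real pipeline the magnitudes would come from the generator at the optimized input vector, and an overlapping windowed STFT with a matching inverse would replace the frame-wise DFT used here.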
