SD · LG · AS · Oct 5, 2023

An Integrated Algorithm for Robust and Imperceptible Audio Adversarial Examples

arXiv:2310.03349v1 · 1 citation · h-index: 4
AI Analysis

This work addresses the challenge of creating more realistic and stealthy audio adversarial attacks on automatic speech recognition systems. It is an incremental advance in adversarial machine learning.

The paper tackles the problem of generating audio adversarial examples that are both robust when played over the air and imperceptible to humans, by integrating psychoacoustic models and simulated room impulse responses into the generation process. The results show improvements in signal-to-noise ratio and in human perception, at the cost of an increased word error rate.

Audio adversarial examples are audio files that have been manipulated to fool an automatic speech recognition (ASR) system while still sounding benign to a human listener. Most methods to generate such samples are based on a two-step algorithm: first, a viable adversarial audio file is produced; then, it is fine-tuned with respect to perceptibility and robustness. In this work, we present an integrated algorithm that uses psychoacoustic models and room impulse responses (RIRs) in the generation step. The RIRs are dynamically created by a neural network during the generation process to simulate a physical environment, hardening our examples against the transformations experienced in over-the-air attacks. We compare the different approaches in three experiments: in a simulated environment and in a realistic over-the-air scenario to evaluate robustness, and in a human study to evaluate perceptibility. Our algorithms that consider psychoacoustics, either alone or in combination with robustness, show an improvement in the signal-to-noise ratio (SNR) as well as in the human perception study, at the cost of an increased word error rate (WER).
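The two metrics named in the abstract, SNR and WER, and the convolution of a signal with an RIR can be illustrated with a minimal Python sketch. This is not the paper's implementation. The function names `snr_db`, `wer`, and `apply_rir` are hypothetical, the SNR treats the difference between the adversarial and the original signal as noise, and the WER is the standard word-level edit distance divided by the reference length. The paper generates its RIRs with a neural network, whereas the sketch simply takes an RIR array as given.

```python
import numpy as np


def snr_db(clean, adversarial):
    """SNR in dB, treating (adversarial - clean) as the noise (assumed definition)."""
    clean = np.asarray(clean, dtype=np.float64)
    noise = np.asarray(adversarial, dtype=np.float64) - clean
    noise_power = np.sum(noise ** 2)
    if noise_power == 0:
        return float("inf")  # identical signals: no perturbation at all
    return 10.0 * np.log10(np.sum(clean ** 2) / noise_power)


def wer(reference, hypothesis):
    """Word error rate: word-level Levenshtein distance / number of reference words."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))  # distances from the empty reference prefix
    for i, r in enumerate(ref, 1):
        curr = [i]
        for j, h in enumerate(hyp, 1):
            curr.append(min(prev[j] + 1,              # deletion
                            curr[j - 1] + 1,          # insertion
                            prev[j - 1] + (r != h)))  # substitution or match
        prev = curr
    return prev[-1] / max(len(ref), 1)


def apply_rir(signal, rir):
    """Simulate playback in a room by convolving with an RIR, truncated to input length."""
    signal = np.asarray(signal, dtype=np.float64)
    rir = np.asarray(rir, dtype=np.float64)
    return np.convolve(signal, rir, mode="full")[: len(signal)]
```

For example, `wer("open the door", "open a door")` counts one substitution over three reference words. A higher SNR means a quieter perturbation. Which transcript serves as the WER reference depends on the evaluation setup and is left to the caller here.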

