AS · CL · SD · Mar 4, 2020

Real-time, Universal, and Robust Adversarial Attacks Against Speaker Recognition Systems

arXiv:2003.02301v2 · 110 citations
AI Analysis

This paper addresses security vulnerabilities in voice user interfaces used by applications that require speaker identification. It presents a novel attack method rather than an incremental improvement.

The paper proposes a real-time, universal, and robust adversarial attack on speaker recognition systems. The attack adds audio-agnostic perturbations to voice inputs, achieving an attack success rate of over 90% and a 100X speedup in attack launching time over contemporary non-universal attacks.

As the popularity of voice user interfaces (VUIs) has exploded in recent years, speaker recognition systems have emerged as an important means of identifying a speaker in many security-critical applications and services. In this paper, we propose the first real-time, universal, and robust adversarial attack against state-of-the-art deep neural network (DNN) based speaker recognition systems. By adding an audio-agnostic universal perturbation to an arbitrary enrolled speaker's voice input, the attack causes the DNN-based speaker recognition system to identify the speaker as any target (i.e., adversary-desired) speaker label. In addition, we improve the robustness of our attack by modeling the sound distortions caused by physical over-the-air propagation through estimating the room impulse response (RIR). Experiments using a public dataset of 109 English speakers demonstrate the effectiveness and robustness of our proposed attack, with a high attack success rate of over 90%. The attack launching time also achieves a 100X speedup over contemporary non-universal attacks.
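
The two mechanisms named in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the amplitude bound `epsilon`, the tiling of the perturbation, and the synthetic signals are all assumptions made for illustration. The first function adds one fixed, input-independent perturbation to any waveform, which is why no per-input optimization is needed at attack time. The second applies the standard model of over-the-air distortion, a convolution of the signal with a room impulse response.

```python
import numpy as np


def apply_universal_perturbation(audio, delta, epsilon=0.01):
    """Add a fixed, input-independent perturbation to a waveform.

    `epsilon` is a hypothetical amplitude bound, not a value from the paper.
    """
    # Bound the perturbation amplitude (assumed L-infinity constraint).
    bounded = np.clip(delta, -epsilon, epsilon)
    # Repeat the perturbation so it covers audio of any length (assumption).
    reps = int(np.ceil(len(audio) / len(bounded)))
    tiled = np.tile(bounded, reps)[: len(audio)]
    # Keep the result inside the valid waveform range [-1, 1].
    return np.clip(audio + tiled, -1.0, 1.0)


def simulate_over_the_air(signal, rir):
    """Model room distortion by convolving with a room impulse response."""
    # Trim the convolution tail so the output matches the input length.
    return np.convolve(signal, rir)[: len(signal)]


# Synthetic stand-ins for a voice input, a perturbation, and an RIR.
rng = np.random.default_rng(0)
audio = rng.uniform(-0.5, 0.5, 16000)
delta = rng.uniform(-0.05, 0.05, 4000)
rir = np.exp(-np.arange(800) / 100.0) * rng.standard_normal(800)

adversarial = apply_universal_perturbation(audio, delta)
received = simulate_over_the_air(adversarial, rir)
```

In a setup like this, the perturbation would be optimized once, offline, against many utterances and many estimated RIRs, and then reused unchanged for every new input.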

