SDAIAS | May 20, 2025

Replay Attacks Against Audio Deepfake Detection

arXiv:2505.14862v2 | 15 citations | h-index: 8 | Has Code | INTERSPEECH
Originality: Incremental advance
AI Analysis

This work exposes a security vulnerability in audio deepfake detection systems that could affect applications relying on them. The advance is incremental because it builds on existing detection methods.

The paper shows that replay attacks undermine audio deepfake detection: playing deepfake audio through various speakers and re-recording it with various microphones makes spoofed samples appear authentic to the detector. For the top-performing model, the Equal Error Rate rises from 4.7% to 18.2%.

We show how replay attacks undermine audio deepfake detection: By playing and re-recording deepfake audio through various speakers and microphones, we make spoofed samples appear authentic to the detection model. To study this phenomenon in more detail, we introduce ReplayDF, a dataset of recordings derived from M-AILABS and MLAAD, featuring 109 speaker-microphone combinations across six languages and four TTS models. It includes diverse acoustic conditions, some highly challenging for detection. Our analysis of six open-source detection models across five datasets reveals significant vulnerability, with the top-performing W2V2-AASIST model's Equal Error Rate (EER) surging from 4.7% to 18.2%. Even with adaptive Room Impulse Response (RIR) retraining, performance remains compromised with an 11.0% EER. We release ReplayDF for non-commercial research use.
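
The Equal Error Rate quoted above is the operating point at which the false acceptance rate (spoofed audio accepted as bona fide) equals the false rejection rate (bona fide audio rejected). The following is a minimal sketch of that computation, not the authors' evaluation code. It assumes that a higher score means "more likely bona fide"; the function name `equal_error_rate` and the toy scores are hypothetical.

```python
def equal_error_rate(bonafide_scores, spoof_scores):
    """Approximate EER by sweeping every observed score as a threshold.

    Assumption: higher score = more likely bona fide. A sample is
    accepted as bona fide when its score is >= the threshold.
    """
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    best_gap, eer = None, None
    for t in thresholds:
        # False rejection rate: bona fide samples scored below the threshold.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        # False acceptance rate: spoofed samples scored at or above it.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        # Keep the threshold where the two error rates are closest.
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy scores (hypothetical): well-separated classes give a low EER.
clean_eer = equal_error_rate([0.9, 0.8, 0.7], [0.1, 0.2, 0.3])

# If replayed spoofs score higher (closer to bona fide), the EER rises.
replayed_eer = equal_error_rate([0.9, 0.8, 0.7], [0.1, 0.75, 0.85])
```

In this toy setting, shifting spoof scores toward the bona fide range raises the EER, which is the kind of degradation the abstract reports for replayed audio.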

