SD AI CR LG AS | May 30, 2025

Rehearsal with Auxiliary-Informed Sampling for Audio Deepfake Detection

Falih Gozi Febrinanto, Kristen Moore, Chandra Thapa, Jiangang Ma, Vidya Saikrishna, Feng Xia

arXiv:2505.24486v1 | 4.01 citations | h-index: 7 | Has Code | INTERSPEECH

Originality: Incremental advance

AI Analysis

This work addresses the challenge of maintaining detection accuracy for audio deepfakes in evolving attack scenarios, representing an incremental improvement over existing rehearsal techniques.

The paper addresses the performance degradation of audio deepfake detection models when they face new attacks. It proposes RAIS, a rehearsal-based continual learning approach that uses auxiliary labels to guide diverse sample selection, and reports an average Equal Error Rate of 1.953% across five experiences.
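For readers unfamiliar with the metric, the Equal Error Rate is the operating point at which the false acceptance rate (spoofed audio accepted as genuine) equals the false rejection rate (genuine audio rejected). The sketch below is a minimal illustration, not the paper's evaluation code. It assumes that higher scores mean "more likely bona fide", that label 1 marks bona fide and 0 marks spoof, and it approximates the crossing point by scanning the observed scores as thresholds. The function name `equal_error_rate` is hypothetical.

```python
def equal_error_rate(scores, labels):
    """Approximate the EER by scanning every observed score as a threshold.

    Assumption: higher score = more likely bona fide; label 1 = bona fide, 0 = spoof.
    """
    bona = [s for s, y in zip(scores, labels) if y == 1]
    spoof = [s for s, y in zip(scores, labels) if y == 0]
    best_gap, eer = float("inf"), None
    for thr in sorted(set(scores)):
        # False acceptance: spoofed samples scoring at or above the threshold.
        far = sum(s >= thr for s in spoof) / len(spoof)
        # False rejection: bona fide samples scoring below the threshold.
        frr = sum(s < thr for s in bona) / len(bona)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the threshold where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


separable = equal_error_rate([0.1, 0.2, 0.8, 0.9], [0, 0, 1, 1])
overlapping = equal_error_rate([0.1, 0.6, 0.4, 0.9], [0, 0, 1, 1])
```

An average over several experiences, as in the figure reported above, would then be the mean of the per-experience values.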

The performance of existing audio deepfake detection frameworks degrades when confronted with new deepfake attacks. Rehearsal-based continual learning (CL), which updates models using a limited set of old data samples, helps preserve prior knowledge while incorporating new information. However, existing rehearsal techniques do not effectively capture the diversity of audio characteristics, introducing bias and increasing the risk of forgetting. To address this challenge, we propose Rehearsal with Auxiliary-Informed Sampling (RAIS), a rehearsal-based CL approach for audio deepfake detection. RAIS employs a label generation network to produce auxiliary labels, guiding diverse sample selection for the memory buffer. Extensive experiments show RAIS outperforms state-of-the-art methods, achieving an average Equal Error Rate (EER) of 1.953% across five experiences. The code is available at: https://github.com/falihgoz/RAIS.
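The idea of using auxiliary labels to guide diverse sample selection can be pictured with a small sketch. This is not the RAIS implementation (see the linked repository for that); it only assumes that each candidate sample already carries an auxiliary label, and it fills the memory buffer round-robin across label groups so that no single group dominates. The function name `select_diverse_buffer` and its parameters are hypothetical.

```python
import random
from collections import defaultdict


def select_diverse_buffer(sample_ids, aux_labels, buffer_size, seed=0):
    """Pick up to buffer_size ids, spreading picks evenly across auxiliary labels.

    Assumption: aux_labels[i] is the auxiliary label already assigned to sample_ids[i].
    """
    rng = random.Random(seed)
    groups = defaultdict(list)
    for sid, label in zip(sample_ids, aux_labels):
        groups[label].append(sid)
    for members in groups.values():
        rng.shuffle(members)  # random choice within each label group

    selected = []
    # Round-robin: take one sample per label group per pass until the buffer is full
    # or every group is exhausted.
    while len(selected) < buffer_size and any(groups.values()):
        for label in sorted(groups):
            if groups[label] and len(selected) < buffer_size:
                selected.append(groups[label].pop())
    return selected


ids = list(range(12))
aux = [0] * 8 + [1] * 2 + [2] * 2  # label 0 is heavily over-represented
buffer_ids = select_diverse_buffer(ids, aux, buffer_size=6)
per_label = {lab: sum(1 for i in buffer_ids if aux[i] == lab) for lab in set(aux)}
```

With uniform random sampling the over-represented label would tend to fill most of the buffer; the round-robin pass instead gives each auxiliary label an equal share while samples remain.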
