SD AI · Sep 28, 2025

Generalizable Speech Deepfake Detection via Information Bottleneck Enhanced Adversarial Alignment

Pu Huang, Shouguang Wang, Siya Yao, Mengchu Zhou

arXiv:2509.23618v1 · 4.0 · h-index: 21

Originality: Highly original

AI Analysis

This work addresses the security risks that realistic speech deepfakes pose to applications such as authentication and media forensics. It represents a strong, specific gain rather than a foundational breakthrough.

The paper tackles speech deepfake detection across diverse spoofing methods and recording conditions. It proposes IB-CAAN, which combines an information bottleneck with adversarial alignment to learn robust features, and reports state-of-the-art performance on benchmarks such as ASVspoof 2019/2021.

Neural speech synthesis techniques have enabled highly realistic speech deepfakes, posing major security risks. Speech deepfake detection is challenging due to distribution shifts across spoofing methods and variability in speakers, channels, and recording conditions. We explore learning shared discriminative features as a path to robust detection and propose the Information Bottleneck enhanced Confidence-Aware Adversarial Network (IB-CAAN). Confidence-guided adversarial alignment adaptively suppresses attack-specific artifacts without erasing discriminative cues, while the information bottleneck removes nuisance variability to preserve transferable features. Experiments on ASVspoof 2019/2021, ASVspoof 5, and In-the-Wild demonstrate that IB-CAAN consistently outperforms the baseline and achieves state-of-the-art performance on many benchmarks.
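The abstract names two ingredients without giving a loss formulation, so the following is only a minimal sketch of how such terms are commonly combined, not the authors' implementation. It assumes a variational information bottleneck penalty (the KL divergence between a diagonal Gaussian encoder and a standard normal prior) and uses the detector's maximum softmax probability as a stand-in for "confidence". The function names, the choice of confidence measure, and the weights `beta` and `lam` are all hypothetical.

```python
import math


def gaussian_kl_to_standard_normal(mu, log_var):
    """KL( N(mu, diag(exp(log_var))) || N(0, I) ), summed over dimensions.

    This is the usual variational upper bound used as an information
    bottleneck penalty; whether IB-CAAN uses this exact form is an assumption.
    """
    return 0.5 * sum(math.exp(lv) + m * m - 1.0 - lv for m, lv in zip(mu, log_var))


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def confidence_weight(class_logits):
    """Hypothetical confidence: max softmax probability of the detector head."""
    return max(softmax(class_logits))


def combined_loss(cls_loss, align_loss, mu, log_var, class_logits,
                  beta=1e-3, lam=1.0):
    """Classification + confidence-weighted alignment + bottleneck penalty.

    align_loss is assumed to be the encoder-side alignment term, already
    oriented so that lower is better for the encoder (for example, after a
    gradient reversal layer). Scaling it by the confidence is an assumption.
    """
    weight = confidence_weight(class_logits)
    kl = gaussian_kl_to_standard_normal(mu, log_var)
    return cls_loss + lam * weight * align_loss + beta * kl
```

In this sketch, a sample on which the detector is uncertain (confidence near 0.5 for two classes) contributes less alignment pressure than one it classifies confidently, which is one plausible reading of "adaptively suppresses attack-specific artifacts without erasing discriminative cues".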
