SD · AI · Apr 9, 2025

Detect All-Type Deepfake Audio: Wavelet Prompt Tuning for Enhanced Auditory Perception

arXiv:2504.06753v1 · 12 citations · h-index: 15 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses the threat that malicious deepfake audio poses to multimedia security and trust, and it offers a more universal detection approach. It is incremental in that it builds on existing self-supervised learning methods.

The paper tackles the problem of detecting deepfake audio across multiple types (speech, sound, singing voice, and music) by introducing a wavelet prompt tuning method, achieving an average equal error rate (EER) of 3.58%.

The rapid advancement of audio generation technologies has escalated the risks of malicious deepfake audio across speech, sound, singing voice, and music, threatening multimedia security and trust. While existing countermeasures (CMs) perform well in single-type audio deepfake detection (ADD), their performance declines in cross-type scenarios. This paper is dedicated to studying the all-type ADD task. We are the first to comprehensively establish an all-type ADD benchmark to evaluate current CMs, incorporating cross-type deepfake detection across speech, sound, singing voice, and music. Then, we introduce the prompt tuning self-supervised learning (PT-SSL) training paradigm, which optimizes the SSL frontend by learning specialized prompt tokens for ADD, requiring 458x fewer trainable parameters than fine-tuning (FT). Considering the auditory perception of different audio types, we propose the wavelet prompt tuning (WPT)-SSL method to capture type-invariant auditory deepfake information from the frequency domain without requiring additional training parameters, thereby enhancing performance over FT in the all-type ADD task. To obtain a universal CM, we utilize all types of deepfake audio for co-training. Experimental results demonstrate that WPT-XLSR-AASIST achieved the best performance, with an average EER of 3.58% across all evaluation sets. The code is available online.
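
For readers unfamiliar with the headline metric, the following is a minimal sketch of how an equal error rate is commonly computed from detector scores. It is not the paper's evaluation code: the function name `equal_error_rate`, the toy scores, and the convention that a higher score means "more likely bona fide" are all assumptions made for illustration.

```python
def equal_error_rate(bonafide_scores, spoof_scores):
    """Return (eer, threshold), assuming higher scores mean 'more bona fide'.

    Hypothetical helper for illustration; it scans every observed score as a
    candidate threshold and keeps the one where the two error rates are closest.
    """
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    best_gap, eer, best_thr = float("inf"), 1.0, None
    for thr in thresholds:
        # False rejection rate: bona fide audio scored below the threshold.
        frr = sum(s < thr for s in bonafide_scores) / len(bonafide_scores)
        # False acceptance rate: spoofed audio scored at or above the threshold.
        far = sum(s >= thr for s in spoof_scores) / len(spoof_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Report the midpoint, since FAR and FRR rarely match exactly
            # on a finite set of scores.
            best_gap, eer, best_thr = gap, (far + frr) / 2, thr
    return eer, best_thr


# Toy scores (made up): one bona fide clip and one spoofed clip are misranked.
eer, thr = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.1, 0.2, 0.3, 0.6])
print(f"EER = {eer:.2%} at threshold {thr}")  # EER = 25.00% at threshold 0.6
```

An average EER such as the one quoted above would then be the mean of the per-set values obtained this way, one for each evaluation set.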

Code Implementations: 1 repo
