SD AI AS Aug 20, 2024

Does Current Deepfake Audio Detection Model Effectively Detect ALM-based Deepfake Audio?

Yuankun Xie, Chenxu Xiong, Xiaopeng Wang, Zhiyong Wang, Yi Lu, Xin Qi, Ruibo Fu, Yukun Liu, Zhengqi Wen, Jianhua Tao, Guanjun Li, Long Ye

arXiv:2408.10853v16.74 citations h-index: 26 Has Code

Originality Synthesis-oriented

AI Analysis

This paper addresses the societal threat posed by ALM-based deepfake audio by evaluating the effectiveness of existing detection methods, and it shows incremental improvements over them.

The paper investigated whether current deepfake audio detection models can effectively detect Audio Language Model (ALM)-based deepfake audio, finding that the latest codec-trained countermeasure achieved a 0% equal error rate under most test conditions.

Audio Language Models (ALMs) are advancing rapidly, driven by developments in large language models and neural audio codecs. These ALMs have significantly lowered the barrier to creating deepfake audio, generating highly realistic and diverse types of deepfake audio that pose severe threats to society. Consequently, effective audio deepfake detection technologies for ALM-based audio have become increasingly critical. This paper investigates the effectiveness of current countermeasures (CMs) against ALM-based audio. Specifically, we collect 12 types of the latest ALM-based deepfake audio and evaluate them using the latest CMs. Our findings reveal that the latest codec-trained CM can effectively detect ALM-based audio, achieving 0% equal error rate under most ALM test conditions, which exceeded our expectations. This indicates promising directions for future research in ALM-based deepfake audio detection.
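
The headline metric above, equal error rate (EER), is the operating point at which the false acceptance rate (spoofed audio accepted as genuine) equals the false rejection rate (genuine audio rejected). The following is a minimal sketch of how an EER could be estimated from detector scores. It is not the paper's evaluation code: the function name `equal_error_rate` and the convention that a higher score means "more likely bona fide" are assumptions made for illustration.

```python
def equal_error_rate(bonafide_scores, spoof_scores):
    """Approximate the EER by sweeping a decision threshold.

    Assumed convention (hypothetical): higher score = more likely bona fide.
    A sample is accepted as bona fide when its score >= threshold.
    """
    # Candidate thresholds: every observed score, plus +inf (reject everything).
    thresholds = sorted(set(bonafide_scores) | set(spoof_scores))
    thresholds = thresholds + [float("inf")]

    best_gap, eer = float("inf"), 1.0
    for t in thresholds:
        # False rejection rate: bona fide samples scored below the threshold.
        frr = sum(s < t for s in bonafide_scores) / len(bonafide_scores)
        # False acceptance rate: spoofed samples scored at or above the threshold.
        far = sum(s >= t for s in spoof_scores) / len(spoof_scores)
        # Keep the threshold where the two error rates are closest.
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Perfectly separated scores: some threshold yields zero errors of both kinds.
perfect = equal_error_rate([0.90, 0.80, 0.95], [0.10, 0.20, 0.30])

# Overlapping scores: no threshold separates the two classes.
overlap = equal_error_rate([0.90, 0.40], [0.10, 0.60])
```

Under this sketch, a 0% EER means that at least one threshold separates every bona fide sample from every spoofed sample in the test set, which is the situation the abstract reports for most ALM test conditions.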

