SD · AI · AS · Aug 20, 2022

Fully Automated End-to-End Fake Audio Detection

arXiv:2208.09618v1 · 36 citations · h-index: 41
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of detecting fake audio for security applications. Its contribution is an incremental improvement: the network design is automated rather than hand-tuned.

The paper tackles the problem of fake audio detection by proposing a fully automated end-to-end method that eliminates manual parameter tuning, achieving an equal error rate (EER) of 1.08% on the ASVspoof 2019 LA dataset, outperforming the state-of-the-art single system.
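For readers unfamiliar with the metric, the equal error rate is the operating point at which the false acceptance rate (spoofed audio accepted as genuine) equals the false rejection rate (genuine audio rejected). The sketch below is a minimal illustration, not the paper's evaluation code: the function name `equal_error_rate`, the convention that a higher score means "more likely bona fide", and the toy scores are all assumptions made for this example.

```python
import numpy as np

def equal_error_rate(bonafide_scores, spoof_scores):
    """Return (eer, threshold). Assumes higher score = more likely bona fide."""
    bonafide = np.asarray(bonafide_scores, dtype=float)
    spoof = np.asarray(spoof_scores, dtype=float)
    # Every observed score is a candidate decision threshold (np.unique sorts).
    thresholds = np.unique(np.concatenate([bonafide, spoof]))
    # False rejection rate: bona fide trials scored below the threshold.
    frr = np.array([(bonafide < t).mean() for t in thresholds])
    # False acceptance rate: spoofed trials scored at or above the threshold.
    far = np.array([(spoof >= t).mean() for t in thresholds])
    # The EER lies where the two error curves cross; take the closest point.
    idx = int(np.argmin(np.abs(far - frr)))
    return (far[idx] + frr[idx]) / 2.0, thresholds[idx]

# Toy scores, invented purely for illustration.
eer, thr = equal_error_rate([0.9, 0.8, 0.7, 0.35], [0.1, 0.2, 0.3, 0.4])
print(f"EER = {eer:.2%} at threshold {thr}")
```

With a finite trial list the two error curves rarely cross exactly, so this sketch averages them at the nearest threshold; official scoring tools may interpolate differently.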

Existing fake audio detection systems often rely on expert experience to design the acoustic features or to manually set the hyperparameters of the network structure. However, manual adjustment of these parameters can noticeably influence the results, and it is almost impossible to find the best set of parameters by hand. Therefore, this paper proposes a fully automated end-to-end fake audio detection method. We first use a pre-trained wav2vec model to obtain a high-level representation of the speech. For the network structure, we use a modified version of differentiable architecture search (DARTS) named light-DARTS. It learns deep speech representations while automatically learning and optimizing complex neural structures consisting of convolutional operations and residual blocks. Experimental results on the ASVspoof 2019 LA dataset show that our proposed system achieves an equal error rate (EER) of 1.08%, which outperforms the state-of-the-art single system.
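The core idea of differentiable architecture search can be sketched briefly. In standard DARTS, each edge of the network computes a softmax-weighted mixture of candidate operations, so the architecture weights can be learned by gradient descent and the strongest operation is kept afterwards. The example below illustrates only that generic relaxation. The abstract does not describe how light-DARTS modifies it, so none of this should be read as the authors' implementation; the candidate operations (`identity`, `zero`, `smooth3`) and the helper names are hypothetical.

```python
import numpy as np

def softmax(x):
    z = np.exp(x - np.max(x))  # subtract the max for numerical stability
    return z / z.sum()

# Hypothetical candidate operations acting on a 1-D feature vector.
def identity(x):
    return x

def zero(x):
    return np.zeros_like(x)

def smooth3(x):
    # Stand-in for a convolution: 3-tap moving average, same output length.
    return np.convolve(x, np.ones(3) / 3.0, mode="same")

CANDIDATES = [identity, zero, smooth3]

def mixed_op(x, alpha):
    """Continuous relaxation: softmax(alpha)-weighted sum of all candidates."""
    weights = softmax(np.asarray(alpha, dtype=float))
    return sum(w * op(x) for w, op in zip(weights, CANDIDATES))

def discretize(alpha):
    """After the search, keep only the candidate with the largest weight."""
    return CANDIDATES[int(np.argmax(alpha))].__name__

x = np.array([1.0, 2.0, 3.0])
print(mixed_op(x, [0.0, 0.0, 0.0]))   # equal mix of the three candidates
print(discretize([0.1, -1.0, 2.0]))   # name of the dominant operation
```

In a real search the `alpha` values would be trained jointly with the network weights in an automatic-differentiation framework; NumPy is used here only to keep the forward computation easy to follow.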

