SDAS · Feb 9, 2022

CAU_KU team's submission to ADD 2022 Challenge task 1: Low-quality fake audio detection through frequency feature masking

arXiv:2202.04328v1
Originality: Synthesis-oriented
AI Analysis

This work addresses the detection of synthetic audio under low-quality conditions, which matters for security and forensics applications. Its contribution is incremental: it applies a new augmentation technique to existing models.

The paper tackles low-quality fake audio detection by proposing a frequency feature masking (FFM) augmentation technique applied to spectrogram-based models, achieving a 23.8% equal error rate (EER) and ranking 3rd on track 1 of the ADD 2022 Challenge.
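For readers unfamiliar with the metric, the equal error rate is the operating point at which the false acceptance rate (fake audio accepted as genuine) equals the false rejection rate (genuine audio rejected). The following is a minimal sketch of one common way to estimate it from detector scores, not the challenge's official scoring tool. The function name `equal_error_rate`, the convention that higher scores mean "more likely genuine", and the toy scores are assumptions made for illustration.

```python
def equal_error_rate(genuine_scores, fake_scores):
    """Estimate the EER by sweeping a decision threshold over all observed scores.

    Assumption: a higher score means the detector believes the audio is genuine,
    and an utterance is accepted when its score is >= the threshold.
    """
    thresholds = sorted(set(genuine_scores) | set(fake_scores))
    best_gap = float("inf")
    eer = None
    for t in thresholds:
        # False rejection rate: genuine utterances scored below the threshold.
        frr = sum(s < t for s in genuine_scores) / len(genuine_scores)
        # False acceptance rate: fake utterances scored at or above the threshold.
        far = sum(s >= t for s in fake_scores) / len(fake_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the threshold where the two error rates are closest and
            # report their average as the EER estimate.
            best_gap = gap
            eer = (far + frr) / 2
    return eer


# Toy scores (hypothetical): one genuine and one fake utterance are misranked.
genuine = [0.9, 0.8, 0.7, 0.35]
fake = [0.1, 0.2, 0.3, 0.75]
eer = equal_error_rate(genuine, fake)
```

With a finite score set the two error rates rarely cross exactly, so the sketch reports the average at the closest threshold. Evaluation toolkits often interpolate instead, which can give slightly different values.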

This technical report describes the Chung-Ang University and Korea University (CAU_KU) team's model for the Audio Deep Synthesis Detection (ADD) 2022 Challenge, track 1: low-quality fake audio detection. For track 1, we propose a frequency feature masking (FFM) augmentation technique to deal with a low-quality audio environment. We applied FFM and mixup augmentation to five spectrogram-based deep neural network architectures that performed well for spoofing detection using mel-spectrogram and constant Q transform (CQT) features. Our best submission achieved an EER of 23.8%, ranking 3rd on track 1.
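The abstract does not spell out how FFM is implemented, so the following is only a minimal sketch of the general idea of masking a band of frequency bins in a spectrogram, followed by standard mixup between two examples. The function names `frequency_mask` and `mixup`, the single contiguous band, the zero fill value, the `max_width` and `alpha` parameters, and the label convention are all assumptions for illustration and may differ from the team's actual method. Plain nested lists stand in for a (frequency, time) array to keep the sketch dependency-free.

```python
import random


def frequency_mask(spec, max_width, rng, fill_value=0.0):
    """Mask one random contiguous band of frequency bins.

    spec is a list of rows, one row per frequency bin (assumed layout).
    Returns the masked copy plus the band's start index and width.
    """
    n_freq = len(spec)
    if not 0 <= max_width <= n_freq:
        raise ValueError("max_width must be between 0 and the number of bins")
    width = rng.randint(0, max_width)        # inclusive on both ends
    start = rng.randint(0, n_freq - width)   # band always fits inside spec
    masked = [row[:] for row in spec]        # leave the input untouched
    for f in range(start, start + width):
        masked[f] = [fill_value] * len(spec[f])
    return masked, start, width


def mixup(x1, y1, x2, y2, alpha, rng):
    """Standard mixup: convex combination of two inputs and their labels."""
    lam = rng.betavariate(alpha, alpha)
    mixed_x = [
        [lam * a + (1 - lam) * b for a, b in zip(row1, row2)]
        for row1, row2 in zip(x1, x2)
    ]
    mixed_y = lam * y1 + (1 - lam) * y2
    return mixed_x, mixed_y, lam


# Toy usage: an 8-bin, 5-frame "spectrogram" of ones (hypothetical data).
rng = random.Random(0)
spec = [[1.0] * 5 for _ in range(8)]
masked, start, width = frequency_mask(spec, max_width=3, rng=rng)
# Hypothetical label convention for this sketch: 1.0 = genuine, 0.0 = fake.
mixed_x, mixed_y, lam = mixup(spec, 1.0, masked, 0.0, alpha=0.4, rng=rng)
```

The intuition behind masking frequency bands is that low-quality audio may lose or corrupt parts of the spectrum, so a model trained on masked inputs is pushed to avoid relying on any single band.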

