CL LG SD AS

Apr 1, 2019

ASSERT: Anti-Spoofing with Squeeze-Excitation and Residual neTworks

Cheng-I Lai, Nanxin Chen, Jesús Villalba, Najim Dehak

arXiv:1904.01120v16.1179 citations

Originality: Incremental advance

AI Analysis

This work addresses anti-spoofing for speech authentication systems, presenting an incremental improvement with strong specific gains.

The paper tackles anti-spoofing in speech systems by proposing ASSERT, a DNN-based pipeline for detecting spoofing attacks such as text-to-speech, voice conversion, and replay. It reports relative improvements of more than 93% and 17% over the baseline systems in the two ASVspoof 2019 sub-challenges.
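For readers unfamiliar with the term, a "relative improvement" over a baseline is usually computed as the fractional reduction of an error-style metric. The sketch below assumes a lower-is-better metric; the function name `relative_improvement` and the input numbers are hypothetical illustrations, not values taken from the paper.

```python
def relative_improvement(baseline: float, system: float) -> float:
    """Percentage reduction of a lower-is-better metric relative to a baseline."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - system) / baseline


# Hypothetical numbers for illustration only (not results from the paper):
# a metric that drops from 4.0 to 1.0 is a 75% relative improvement.
example = relative_improvement(baseline=4.0, system=1.0)
```

Under this definition, a relative improvement above 93% would mean the system's metric is less than 7% of the baseline's value.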

We present JHU's system submission to the ASVspoof 2019 Challenge: Anti-Spoofing with Squeeze-Excitation and Residual neTworks (ASSERT). Anti-spoofing has gathered more and more attention since the inauguration of the ASVspoof Challenges, and ASVspoof 2019 is dedicated to addressing attacks of all three major types: text-to-speech, voice conversion, and replay. Built upon previous research on Deep Neural Networks (DNNs), ASSERT is a pipeline for a DNN-based approach to anti-spoofing. ASSERT has four components: feature engineering, DNN models, network optimization, and system combination, where the DNN models are variants of squeeze-excitation and residual networks. We conducted an ablation study of the effectiveness of each component on the ASVspoof 2019 corpus, and experimental results showed that ASSERT obtained more than 93% and 17% relative improvements over the baseline systems in the two sub-challenges of ASVspoof 2019, ranking ASSERT among the top-performing systems. Code and pretrained models will be made publicly available.
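The abstract names squeeze-excitation networks without describing them. As a rough illustration of the general idea (global average pooling per channel, a small two-layer gating network, then channel-wise rescaling), here is a minimal pure-Python sketch. It is not the ASSERT implementation: the function `squeeze_excite`, the weight layout, and the omission of biases are all assumptions made for brevity.

```python
import math


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def squeeze_excite(feature_maps, w1, w2):
    """Apply a squeeze-excitation style gate to a list of channels.

    feature_maps: C channels, each an H x W nested list of floats.
    w1: (C // r) x C weights of the bottleneck layer (hypothetical layout).
    w2: C x (C // r) weights of the expansion layer (hypothetical layout).
    Biases are omitted to keep the sketch short.
    """
    # Squeeze: global average pooling gives one scalar per channel.
    squeezed = [
        sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
        for ch in feature_maps
    ]
    # Excitation: linear -> ReLU -> linear -> sigmoid gives one gate per channel.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [sigmoid(sum(w * h for w, h in zip(row, hidden))) for row in w2]
    # Scale: multiply every value in a channel by that channel's gate.
    return [
        [[g * v for v in row] for row in ch]
        for ch, g in zip(feature_maps, gates)
    ]


# Two 2x2 channels with all-zero weights: every gate is sigmoid(0) = 0.5,
# so each value in the output is half of the corresponding input value.
maps = [[[1.0, 2.0], [3.0, 4.0]], [[2.0, 2.0], [2.0, 2.0]]]
out = squeeze_excite(maps, w1=[[0.0, 0.0]], w2=[[0.0], [0.0]])
```

In a residual network, a gate like this is typically applied to the output of a residual branch before it is added back to the skip connection; how ASSERT's variants arrange this is not specified in the abstract.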
