CL · LG · SD · AS · Apr 1, 2019

ASSERT: Anti-Spoofing with Squeeze-Excitation and Residual neTworks

arXiv:1904.01120v1 · 179 citations
Originality: Incremental advance
AI Analysis

This work addresses anti-spoofing for speech authentication systems, presenting an incremental improvement with strong specific gains.

The paper tackles anti-spoofing in speech systems by proposing ASSERT, a DNN-based pipeline for detecting spoofing attacks such as text-to-speech, voice conversion, and replay. It reports relative improvements of more than 93% and 17% over the baseline systems in the two ASVspoof 2019 sub-challenges.
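As a reading aid, the sketch below shows how a relative improvement is usually computed for error-style metrics where lower is better. The function name and the numbers are hypothetical illustrations, not values or code from the paper, and the paper's exact formula is assumed rather than confirmed.

```python
def relative_improvement(baseline: float, system: float) -> float:
    """Percentage by which `system` reduces a lower-is-better metric.

    Assumes the common definition (baseline - system) / baseline * 100.
    """
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return (baseline - system) / baseline * 100.0


# Hypothetical numbers chosen only to illustrate the arithmetic.
print(relative_improvement(0.5, 0.125))  # 75.0
```

Under this definition, an improvement above 93% means the system's error metric is less than 7% of the baseline's.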

We present JHU's system submission to the ASVspoof 2019 Challenge: Anti-Spoofing with Squeeze-Excitation and Residual neTworks (ASSERT). Anti-spoofing has gathered increasing attention since the inauguration of the ASVspoof Challenges, and ASVspoof 2019 is dedicated to addressing attacks of all three major types: text-to-speech, voice conversion, and replay. Built upon previous research on Deep Neural Networks (DNNs), ASSERT is a pipeline for a DNN-based approach to anti-spoofing. ASSERT has four components: feature engineering, DNN models, network optimization, and system combination, where the DNN models are variants of squeeze-excitation and residual networks. We conducted an ablation study of the effectiveness of each component on the ASVspoof 2019 corpus, and experimental results showed that ASSERT obtained more than 93% and 17% relative improvements over the baseline systems in the two sub-challenges of ASVspoof 2019, ranking ASSERT among the top-performing systems. Code and pretrained models will be made publicly available.
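To make the "squeeze-excitation and residual" idea concrete, here is a minimal, dependency-free sketch of a generic squeeze-excitation gate wrapped in a residual connection. It is not ASSERT's released code: the function names, the weight shapes, and the `branch` placeholder (standing in for the convolutional layers of a real residual block) are all assumptions made for illustration, and a practical model would use a deep learning framework instead of nested lists.

```python
import math


def squeeze_excitation(feature_map, w1, w2):
    """Reweight the channels of a feature map with a squeeze-excitation gate.

    feature_map: list of C channels, each an H x W nested list of floats.
    w1: bottleneck weights, shape (C // r) x C (r is the reduction ratio).
    w2: expansion weights, shape C x (C // r).
    Returns the rescaled feature map and the per-channel gates.
    """
    # Squeeze: global average pooling yields one scalar per channel.
    squeezed = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excitation: fully connected layer + ReLU, then fully connected + sigmoid.
    hidden = [max(0.0, sum(w * s for w, s in zip(row, squeezed))) for row in w1]
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * h for w, h in zip(row, hidden))))
        for row in w2
    ]
    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[gate * value for value in row] for row in channel]
        for channel, gate in zip(feature_map, gates)
    ]
    return scaled, gates


def se_residual_block(feature_map, branch, w1, w2):
    """Compute x + SE(branch(x)), i.e. an identity shortcut plus a gated branch.

    `branch` is a hypothetical stand-in for the block's convolutional layers;
    it must return a feature map with the same shape as its input.
    """
    gated, _ = squeeze_excitation(branch(feature_map), w1, w2)
    return [
        [[a + b for a, b in zip(row_x, row_g)] for row_x, row_g in zip(ch_x, ch_g)]
        for ch_x, ch_g in zip(feature_map, gated)
    ]


# Toy input: 2 channels of 2 x 2 ones, reduction ratio r = 2.
x = [[[1.0, 1.0], [1.0, 1.0]] for _ in range(2)]
w1 = [[0.0, 0.0]]      # shape 1 x 2
w2 = [[0.0], [0.0]]    # shape 2 x 1
# With all-zero weights every gate is sigmoid(0) = 0.5, so each output is 1 + 0.5.
out = se_residual_block(x, lambda fm: fm, w1, w2)
print(out[0][0][0])  # 1.5
```

The gate lets the network emphasise or suppress whole channels based on a global summary of the input, while the shortcut keeps the original signal available to later layers.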

Code Implementations: 1 repo