SD, AS | Sep 5, 2021

The SpeakIn System for VoxCeleb Speaker Recognition Challange 2021

arXiv:2109.01989v1 | 67 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses speaker verification, but it is incremental: it builds on existing methods and targets a specific competition.

The authors tackled speaker verification in the VoxCeleb Speaker Recognition Challenge 2021 with a fusion of 9 models, combining data augmentation, domain-based large-margin fine-tuning, and back-end refinement. The system placed first in both tracks with a minDCF of 0.1034 and an EER of 1.8460%.

This report describes our submission to track 1 and track 2 of the VoxCeleb Speaker Recognition Challenge 2021 (VoxSRC 2021). Both tracks share the same speaker verification system, which uses only VoxCeleb2-dev as its training set. The report explores several components: data augmentation, network structures, domain-based large-margin fine-tuning, and back-end refinement. Our system is a fusion of 9 models and achieves first place in both tracks of VoxSRC 2021. The minDCF of our submission is 0.1034, and the corresponding EER is 1.8460%.
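The two figures quoted above, minDCF and EER, are both computed from a list of trial scores and target/non-target labels. The following is a minimal sketch of how such metrics are commonly computed, not the challenge's official scoring tool. The function names are hypothetical, and the cost parameters (`p_target=0.05`, `c_miss=1`, `c_fa=1`) are assumed defaults that should be checked against the evaluation plan.

```python
def error_curves(scores, labels):
    """Sweep a threshold over the sorted scores.

    labels: 1 for target (same-speaker) trials, 0 for non-target trials.
    Returns two lists (miss rate, false-alarm rate), one entry per threshold.
    """
    pairs = sorted(zip(scores, labels))
    n_tar = sum(1 for _, y in pairs if y == 1)
    n_non = len(pairs) - n_tar
    if n_tar == 0 or n_non == 0:
        raise ValueError("need at least one target and one non-target trial")

    # Threshold below every score: everything accepted.
    fnr, fpr = [0.0], [1.0]
    misses, false_alarms = 0, n_non
    i = 0
    while i < len(pairs):
        j = i
        # Move the threshold past all trials that share this score (ties).
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                misses += 1
            else:
                false_alarms -= 1
            j += 1
        fnr.append(misses / n_tar)
        fpr.append(false_alarms / n_non)
        i = j
    return fnr, fpr


def compute_eer(scores, labels):
    """Equal error rate: the operating point where miss and false-alarm rates meet.

    Approximated by the threshold with the smallest |miss - false alarm| gap.
    """
    fnr, fpr = error_curves(scores, labels)
    k = min(range(len(fnr)), key=lambda idx: abs(fnr[idx] - fpr[idx]))
    return (fnr[k] + fpr[k]) / 2.0


def compute_min_dcf(scores, labels, p_target=0.05, c_miss=1.0, c_fa=1.0):
    """Minimum normalized detection cost over all thresholds.

    The cost parameters are assumptions, not values taken from the report.
    """
    fnr, fpr = error_curves(scores, labels)
    costs = [c_miss * p_target * m + c_fa * (1.0 - p_target) * f
             for m, f in zip(fnr, fpr)]
    # Normalize by the cost of the best trivial system (accept all or reject all).
    return min(costs) / min(c_miss * p_target, c_fa * (1.0 - p_target))
```

Both functions return fractions, so an EER of 1.8460% corresponds to a return value of about 0.01846. Fusing several models at the score level (for example, by a weighted average of per-model scores) would simply change the `scores` passed in.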

Code Implementations: 1 repo
