AS CL SD
Jul 13, 2019

BUT VOiCES 2019 System Description

Hossein Zeinali, Pavel Matějka, Ladislav Mošner, Oldřich Plchot, Anna Silnova, Ondřej Novotný, Ján Profant, Ondřej Glembek, Lukáš Burget

arXiv:1907.06112v12.31 citations

Originality Synthesis-oriented

AI Analysis

This is an incremental improvement in speaker recognition for a specific challenge, with limited broader impact.

The paper addresses the VOiCES 2019 Speaker Recognition challenge. A fusion of three systems reaches 1.0% EER, a 15% relative improvement over the single best system (1.2% EER).

This is a description of our effort in the VOiCES 2019 Speaker Recognition challenge. All systems in the fixed condition are based on the x-vector paradigm with different features and DNN topologies. The single best system reaches 1.2% EER, and a fusion of three systems yields 1.0% EER, which is a 15% relative improvement. The open condition allowed us to use external data, which we used for PLDA adaptation; this gave a relative improvement of less than ~10%. In our submission to the open condition, we used three x-vector systems and one i-vector based system.
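The two figures quoted in the abstract, EER and relative improvement, can be illustrated with a minimal Python sketch. It is not the authors' scoring code: the function names `equal_error_rate` and `relative_improvement` are hypothetical, and the EER here is a simple threshold sweep over the observed scores rather than the interpolated estimate an evaluation toolkit would normally use.

```python
import numpy as np


def relative_improvement(baseline_eer, new_eer):
    """Fraction by which new_eer improves on baseline_eer."""
    return (baseline_eer - new_eer) / baseline_eer


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate EER by sweeping a threshold over all observed scores.

    Assumption: higher scores mean "same speaker". A trial is accepted
    when its score is greater than or equal to the threshold.
    """
    target_scores = np.asarray(target_scores, dtype=float)
    nontarget_scores = np.asarray(nontarget_scores, dtype=float)
    thresholds = np.sort(np.concatenate([target_scores, nontarget_scores]))

    # Miss rate: target trials rejected at each threshold.
    miss = np.array([(target_scores < t).mean() for t in thresholds])
    # False-alarm rate: non-target trials accepted at each threshold.
    false_alarm = np.array([(nontarget_scores >= t).mean() for t in thresholds])

    # The EER is the operating point where the two error rates are closest.
    idx = np.argmin(np.abs(miss - false_alarm))
    return (miss[idx] + false_alarm[idx]) / 2.0


if __name__ == "__main__":
    # Toy scores, invented purely for illustration.
    print(equal_error_rate([1.0, 3.0], [0.0, 2.0]))
    # Rounded EERs from the abstract: 1.2% (single best) and 1.0% (fusion).
    print(relative_improvement(1.2, 1.0))
```

Applied to the rounded values 1.2% and 1.0%, the formula gives about 16.7% rather than the reported 15%; the gap is presumably due to rounding of the published EERs, though that is an assumption rather than something the abstract states.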
