AS SD · May 16, 2020

The INTERSPEECH 2020 Far-Field Speaker Verification Challenge

Xiaoyi Qin, Ming Li, Hui Bu, Wei Rao, Rohan Kumar Das, Shrikanth Narayanan, Haizhou Li

arXiv:2005.08046v1 · 13.454 citations

Originality: Synthesis-oriented

AI Analysis

This paper addresses speaker verification in real-life far-field scenarios and is aimed at researchers and practitioners. Its contribution is incremental: it pairs existing methods with a new benchmark.

The paper tackles far-field speaker verification under cross-channel conditions by introducing a challenge with three tasks and a baseline system. The baseline achieves minDCFs of 0.62 to 0.66 and EERs of 6.27% to 7.18% across the three tasks.
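For readers unfamiliar with the EER (equal error rate) figures quoted above, the following is a minimal sketch of how an EER can be estimated from trial scores. It is not the challenge's official scoring tool: the function name `equal_error_rate`, the simple threshold sweep, and the toy scores are all assumptions made for illustration.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Estimate the EER by sweeping a threshold over all observed scores.

    A trial is accepted when its score is >= the threshold. The EER is
    taken at the threshold where the false-acceptance rate (FAR) and the
    false-rejection rate (FRR) are closest, as the mean of the two.
    This is a coarse approximation without interpolation.
    """
    thresholds = sorted(set(target_scores) | set(nontarget_scores))
    best_gap, eer = None, None
    for t in thresholds:
        # Target trials rejected at this threshold
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # Non-target trials accepted at this threshold
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy scores (hypothetical, not from the paper)
targets = [0.9, 0.8, 0.7, 0.4]
nontargets = [0.6, 0.5, 0.3, 0.2]
eer = equal_error_rate(targets, nontargets)
```

With these toy scores, one target trial falls below and one non-target trial falls at or above the best threshold, so both error rates are one in four.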

The INTERSPEECH 2020 Far-Field Speaker Verification Challenge (FFSVC 2020) addresses three different research problems under well-defined conditions: far-field text-dependent speaker verification from a single microphone array, far-field text-independent speaker verification from a single microphone array, and far-field text-dependent speaker verification from distributed microphone arrays. All three tasks pose a cross-channel challenge to the participants. To simulate the real-life scenario, the enrollment utterances are recorded with a close-talk cellphone, while the test utterances are recorded with far-field microphone arrays. In this paper, we describe the database, the challenge, and the baseline system, which is a ResNet-based deep speaker network with cosine similarity scoring. For a given utterance, the speaker embeddings of the different channels are averaged with equal weights to form the final embedding. The baseline system achieves minDCFs of 0.62, 0.66, and 0.64 and EERs of 6.27%, 6.55%, and 7.18% for task 1, task 2, and task 3, respectively.
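The scoring step described in the abstract (equal averaging of per-channel embeddings, followed by cosine similarity) can be sketched as below. This is an illustration under stated assumptions, not the authors' code: the embeddings are toy three-dimensional vectors standing in for the ResNet outputs, and the function names `average_embeddings` and `cosine_similarity` are hypothetical.

```python
import math


def average_embeddings(channel_embeddings):
    """Equally average per-channel embeddings into one utterance embedding."""
    n = len(channel_embeddings)
    dim = len(channel_embeddings[0])
    return [sum(e[i] for e in channel_embeddings) / n for i in range(dim)]


def cosine_similarity(a, b):
    """Cosine of the angle between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Toy data (hypothetical): one close-talk enrollment embedding and
# two far-field channel embeddings for the same test utterance.
enroll_emb = [1.0, 0.0, 1.0]
test_channels = [[1.0, 0.0, 0.0], [0.0, 0.0, 1.0]]

test_emb = average_embeddings(test_channels)
score = cosine_similarity(enroll_emb, test_emb)
```

In a verification system, a score like this would be compared against a decision threshold to accept or reject the claimed identity. The threshold itself is not specified in the abstract.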
