SDAS
Dec 3, 2019

HI-MIA : A Far-field Text-Dependent Speaker Verification Database and the Baselines

arXiv:1912.01231v3
75 citations
Originality: Synthesis-oriented
AI Analysis

The paper addresses data scarcity for researchers and developers working on far-field speaker verification systems. Its modeling contribution is incremental, since the baselines build on existing verification methods.

The paper tackles the lack of far-field text-dependent speaker verification data by introducing the HI-MIA database, with recordings from 340 people. It also proposes baseline systems whose fusion achieves an equal error rate (EER) of 3.29% for far-field enrollment with far-field testing and 4.02% for close-talking enrollment with far-field testing.

This paper presents a far-field text-dependent speaker verification database named HI-MIA. We aim to meet the data requirement for far-field microphone-array-based speaker verification, since most publicly available databases are single-channel, close-talking, and text-independent. The database contains recordings of 340 people in rooms designed for the far-field scenario. Recordings are captured by multiple microphone arrays located in different directions and at different distances from the speaker, and by a high-fidelity close-talking microphone. In addition, we propose a set of end-to-end neural-network-based baseline systems that adopt single-channel data for training. Moreover, we propose a testing-background-aware enrollment augmentation strategy to further enhance performance. Results show that the fusion systems achieve 3.29% EER in the far-field enrollment, far-field testing task and 4.02% EER in the close-talking enrollment, far-field testing task.
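
The results above are reported as equal error rate (EER): the operating point at which the false acceptance rate on impostor trials equals the false rejection rate on genuine trials. The following is a minimal sketch of how an EER can be estimated from verification scores. It is not the authors' scoring code. The function name `equal_error_rate`, the "accept if score >= threshold" convention, and the toy scores are assumptions made for illustration.

```python
def equal_error_rate(target_scores, nontarget_scores):
    """Estimate the EER from two lists of trial scores.

    target_scores:    scores of genuine trials (same speaker).
    nontarget_scores: scores of impostor trials (different speaker).
    A trial is accepted when its score is >= the threshold (assumed convention).
    Returns (eer, threshold) at the candidate threshold where the false
    acceptance rate (FAR) and false rejection rate (FRR) are closest.
    """
    if not target_scores or not nontarget_scores:
        raise ValueError("both score lists must be non-empty")

    # Every observed score is a candidate threshold.
    thresholds = sorted(set(target_scores) | set(nontarget_scores))

    best_gap, best_eer, best_threshold = None, None, None
    for t in thresholds:
        # FAR: fraction of impostor trials wrongly accepted.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        # FRR: fraction of genuine trials wrongly rejected.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            # With finite data FAR and FRR rarely coincide exactly,
            # so report their midpoint at the closest crossing.
            best_gap, best_eer, best_threshold = gap, (far + frr) / 2, t
    return best_eer, best_threshold


# Toy scores, invented for illustration only (not HI-MIA results).
genuine = [0.9, 0.8, 0.7, 0.4]
impostor = [0.6, 0.3, 0.2, 0.1]
eer, threshold = equal_error_rate(genuine, impostor)
print(f"EER = {eer:.2%} at threshold {threshold}")
```

This brute-force scan is quadratic in the number of trials, which is acceptable for a sketch. A real evaluation over many trials would typically sort the scores once and sweep the threshold in a single pass.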
