AS · SD · Feb 2, 2022

The CORAL++ Algorithm for Unsupervised Domain Adaptation of Speaker Recognition

arXiv:2202.01092v1 · 28 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses domain mismatch for speaker recognition systems in real-world applications, representing an incremental improvement over existing methods.

The paper tackles domain mismatch in speaker recognition by proposing CORAL++, an unsupervised domain adaptation algorithm that improves upon CORAL, achieving a 9.40% relative reduction in EER on the NIST 2019 SRE benchmark.
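A relative reduction in EER is the absolute drop divided by the baseline EER. The sketch below shows the arithmetic only. The two EER values are hypothetical and chosen for illustration; they are not taken from the paper, which reports only the 9.40% relative figure.

```python
def relative_reduction(baseline_eer: float, new_eer: float) -> float:
    """Return the relative EER reduction as a fraction of the baseline."""
    if baseline_eer <= 0:
        raise ValueError("baseline_eer must be positive")
    return (baseline_eer - new_eer) / baseline_eer


# Hypothetical EERs in percent (illustrative, not from the paper).
baseline = 5.00   # e.g. a system adapted with CORAL
improved = 4.53   # e.g. the same system adapted with CORAL++
print(f"{100 * relative_reduction(baseline, improved):.2f}% relative reduction")
```

A relative figure can look large even when the absolute change is a fraction of a percentage point, so it is best read alongside the absolute EERs.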

State-of-the-art speaker recognition systems are trained on large amounts of human-labeled data. Such a training set is usually drawn from various data sources to strengthen the model's capacity. In practical deployment, however, unseen conditions are almost inevitable. Domain mismatch, the statistical difference between the training and test data, is therefore a common problem in real-life applications. To alleviate the degradation it causes, we propose a new feature-based unsupervised domain adaptation algorithm. Because the algorithm is a further optimization of the well-known CORrelation ALignment (CORAL), we call it CORAL++. We verify the effectiveness of CORAL++ on the NIST 2019 Speaker Recognition Evaluation (SRE19), using the SRE18 CTS set as the development set. With a typical x-vector/PLDA setup, CORAL++ outperforms CORAL by 9.40% relative in EER.
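For context, the baseline that CORAL++ builds on can be sketched in a few lines. The code below is a minimal illustration of standard CORAL, not of CORAL++, whose details are not given in this abstract. It assumes the features are fixed-dimensional embeddings such as x-vectors, stored as NumPy arrays with one row per utterance. The function names, the `eps` regularizer, and the mean shift to the target domain are assumptions made for this sketch.

```python
import numpy as np


def _sym_power(mat: np.ndarray, power: float) -> np.ndarray:
    """Raise a symmetric positive-definite matrix to a real power."""
    vals, vecs = np.linalg.eigh(mat)
    # V diag(vals**power) V^T
    return (vecs * vals**power) @ vecs.T


def coral_transform(source: np.ndarray, target: np.ndarray,
                    eps: float = 1e-6) -> np.ndarray:
    """Align second-order statistics of source features to the target.

    source: (n_s, d) out-of-domain embeddings (labeled data).
    target: (n_t, d) in-domain embeddings (no labels needed).
    eps:    assumed diagonal regularizer that keeps covariances invertible.
    """
    dim = source.shape[1]
    mu_s = source.mean(axis=0)
    mu_t = target.mean(axis=0)
    cov_s = np.cov(source, rowvar=False) + eps * np.eye(dim)
    cov_t = np.cov(target, rowvar=False) + eps * np.eye(dim)
    whiten = _sym_power(cov_s, -0.5)   # remove source correlations
    recolor = _sym_power(cov_t, 0.5)   # impose target correlations
    # Shifting to the target mean is an assumption of this sketch.
    return (source - mu_s) @ whiten @ recolor + mu_t


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    src = rng.normal(size=(500, 4)) * np.array([1.0, 2.0, 3.0, 4.0])
    tgt = rng.normal(size=(400, 4)) * np.array([2.0, 1.0, 0.5, 3.0]) + 3.0
    adapted = coral_transform(src, tgt)
    # The adapted covariance should closely match the target covariance.
    print(np.abs(np.cov(adapted, rowvar=False) - np.cov(tgt, rowvar=False)).max())
```

In an x-vector/PLDA pipeline, a transform of this kind would typically be applied to the out-of-domain training embeddings before the PLDA back end is trained, so that the back end sees statistics closer to the deployment domain.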
