CL, LG, SD, AS
Nov 26, 2020

Unsupervised Domain Adaptation for Speech Recognition via Uncertainty Driven Self-Training

arXiv:2011.13439v2
73 citations
AI Analysis

This work offers a concrete gain for ASR systems facing domain mismatch, a common problem for developers who deploy ASR models in diverse environments.

This paper addresses the performance degradation of ASR systems when training and test data domains are mismatched. The authors propose DUST, a dropout-based uncertainty-driven self-training technique that filters out high-uncertainty pseudo-labeled data, recovering up to 80% of the performance of a system trained on ground-truth data.

The performance of automatic speech recognition (ASR) systems typically degrades significantly when the training and test data domains are mismatched. In this paper, we show that self-training (ST) combined with an uncertainty-based pseudo-label filtering approach can be used effectively for domain adaptation. We propose DUST, a dropout-based uncertainty-driven self-training technique that measures the model's uncertainty about a prediction by the agreement between multiple predictions of the ASR system obtained with different dropout settings. DUST excludes pseudo-labeled data with high uncertainty from training, which leads to substantially improved ASR results compared to ST without filtering and shortens training time because the training set is smaller. Domain adaptation experiments using WSJ as the source domain and TED-LIUM 3 and SWITCHBOARD as the target domains show that up to 80% of the performance of a system trained on ground-truth data can be recovered.
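
The filtering step described above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how agreement is scored, so this example assumes a length-normalized edit distance between a hypothesis decoded without dropout and several hypotheses decoded with dropout active. The `decode` callable, the `num_samples` and `threshold` values, and all function names are hypothetical.

```python
from typing import Callable, List, Sequence, Tuple


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1,          # delete tok_a
                            curr[j - 1] + 1,      # insert tok_b
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def is_confident(reference: Sequence[str],
                 sampled: List[Sequence[str]],
                 threshold: float) -> bool:
    """Return True if every dropout hypothesis stays close to the reference.

    Assumption: "agreement" is scored as edit distance divided by the
    reference length, and all samples must fall at or below `threshold`.
    """
    if not reference:
        return False  # an empty hypothesis gives nothing to train on
    norm = len(reference)
    return all(edit_distance(reference, hyp) / norm <= threshold
               for hyp in sampled)


def filter_pseudo_labels(utterances: Sequence[str],
                         decode: Callable[..., List[str]],
                         num_samples: int = 3,
                         threshold: float = 0.3) -> List[Tuple[str, List[str]]]:
    """Keep only utterances whose pseudo-label is stable under dropout.

    `decode(utt, dropout=...)` is a hypothetical hook into an ASR system
    that returns a list of tokens for one utterance.
    """
    kept = []
    for utt in utterances:
        reference = decode(utt, dropout=False)  # pseudo-label candidate
        sampled = [decode(utt, dropout=True) for _ in range(num_samples)]
        if is_confident(reference, sampled, threshold):
            kept.append((utt, reference))
    return kept
```

In a self-training loop, the pairs returned by `filter_pseudo_labels` would be added to the training data for the next round, while rejected utterances stay unlabeled. A lower `threshold` keeps fewer but more reliable pseudo-labels.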

