CL | SD | AS | Oct 7, 2021

Magic dust for cross-lingual adaptation of monolingual wav2vec-2.0

arXiv:2110.03560v1 | 20 citations
Originality: Incremental advance
AI Analysis

The paper addresses the problem of limited labeled data for ASR in low-resource languages. The advance is incremental because it builds on existing wav2vec-2.0 methods.

The paper tackles cross-lingual adaptation of monolingual wav2vec-2.0 models for ASR in resource-scarce languages, showing that the adapted model achieves similar performance to a multilingual XLSR model trained on 53 languages.

We propose a simple and effective cross-lingual transfer learning method to adapt monolingual wav2vec-2.0 models for Automatic Speech Recognition (ASR) in resource-scarce languages. We show that a monolingual wav2vec-2.0 is a good few-shot ASR learner in several languages. We improve its performance further via several iterations of Dropout Uncertainty-Driven Self-Training (DUST) by using a moderate-sized unlabeled speech dataset in the target language. A key finding of this work is that, on the target language ASR task, the adapted monolingual wav2vec-2.0 achieves performance similar to that of the topline multilingual XLSR model, which is trained on fifty-three languages.
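
The abstract names DUST but does not describe its selection rule. The sketch below illustrates the general idea only, under stated assumptions: an unlabeled utterance is decoded once without dropout to get a reference hypothesis and several times with dropout enabled, and its pseudo-label is kept only if every dropout hypothesis stays within a normalized edit-distance threshold of the reference. The names `dust_filter` and `toy_transcribe`, the `transcribe(utterance, dropout, seed)` interface, and the values `num_samples=3` and `threshold=0.3` are hypothetical and are not taken from the paper.

```python
from typing import Callable, List, Sequence, Tuple


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance between two token sequences."""
    prev = list(range(len(b) + 1))
    for i, tok_a in enumerate(a, start=1):
        curr = [i]
        for j, tok_b in enumerate(b, start=1):
            cost = 0 if tok_a == tok_b else 1
            curr.append(min(prev[j] + 1, curr[j - 1] + 1, prev[j - 1] + cost))
        prev = curr
    return prev[-1]


def dust_filter(
    utterances: List[str],
    transcribe: Callable[[str, bool, int], str],  # hypothetical model interface
    num_samples: int = 3,    # assumed number of dropout passes
    threshold: float = 0.3,  # assumed acceptance threshold
) -> List[Tuple[str, str]]:
    """Return (utterance, pseudo_label) pairs that are stable under dropout."""
    selected = []
    for utt in utterances:
        # Reference hypothesis: deterministic decoding with dropout disabled.
        reference = transcribe(utt, False, 0)
        ref_tokens = reference.split()
        if not ref_tokens:
            continue  # an empty hypothesis cannot be normalized; skip it
        worst = 0.0
        for seed in range(1, num_samples + 1):
            # Sampled hypothesis: same model with dropout enabled.
            sampled = transcribe(utt, True, seed).split()
            worst = max(worst, edit_distance(ref_tokens, sampled) / len(ref_tokens))
        # Keep the utterance only if all dropout samples agree with the reference.
        if worst < threshold:
            selected.append((utt, reference))
    return selected


def toy_transcribe(utt: str, dropout: bool, seed: int) -> str:
    """Stand-in for an ASR model: utterances starting with 'noisy' are unstable."""
    words = utt.split()
    if dropout and utt.startswith("noisy"):
        words[seed % len(words)] = "<unk>"  # simulate a dropout-induced error
    return " ".join(words)


pseudo_labeled = dust_filter(["clean hello world", "noisy a b"], toy_transcribe)
```

In a full self-training loop, the selected pairs would be added to the labeled set, the model fine-tuned again, and the filter rerun for the next iteration. The toy transcriber treats each string as its own "audio" so the example runs without a real model.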
