CL, SD, AS | Oct 30, 2021

Pseudo-Labeling for Massively Multilingual Speech Recognition

arXiv:2111.00161v3 | 36 citations
Originality: Incremental advance
AI Analysis

This work aims to improve speech recognition for low-resource languages in a multilingual setting. It is an incremental advance, since it adapts existing pseudo-labeling methods.

The paper extends pseudo-labeling to massively multilingual speech recognition across 60 languages. The resulting model performs better for many languages and transfers well to LibriSpeech.

Semi-supervised learning through pseudo-labeling has become a staple of state-of-the-art monolingual speech recognition systems. In this work, we extend pseudo-labeling to massively multilingual speech recognition with 60 languages. We propose a simple pseudo-labeling recipe that works well even with low-resource languages: train a supervised multilingual model, fine-tune it with semi-supervised learning on a target language, generate pseudo-labels for that language, and train a final model using pseudo-labels for all languages, either from scratch or by fine-tuning. Experiments on the labeled Common Voice and unlabeled VoxPopuli datasets show that our recipe can yield a model with better performance for many languages that also transfers well to LibriSpeech.
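
The four steps of the recipe in the abstract can be sketched as a small orchestration function. This is a minimal illustration, not the authors' implementation: the names `pseudo_label_recipe`, `train`, and `finetune_ssl` are hypothetical, the training routines are passed in as callables, and two choices are assumptions that the abstract does not specify (mixing the original labeled data with the pseudo-labels in the last step, and showing only the "from scratch" variant of the final training).

```python
from typing import Callable, Dict, List, Tuple

Audio = str                          # stand-in for an audio clip (hypothetical type)
Labeled = List[Tuple[Audio, str]]    # (audio, transcript) pairs
Model = Callable[[Audio], str]       # a model maps audio to a transcript


def pseudo_label_recipe(
    labeled: Dict[str, Labeled],
    unlabeled: Dict[str, List[Audio]],
    train: Callable[[Dict[str, Labeled]], Model],
    finetune_ssl: Callable[[Model, Labeled, List[Audio]], Model],
) -> Tuple[Model, Dict[str, Labeled]]:
    """Run the four-step recipe; `train` and `finetune_ssl` are caller-supplied."""
    # Step 1: train a supervised multilingual model on all labeled data.
    base = train(labeled)

    pseudo: Dict[str, Labeled] = {}
    for lang, clips in unlabeled.items():
        # Step 2: fine-tune the base model on one target language with
        # semi-supervised learning (labeled + unlabeled audio of that language).
        lang_model = finetune_ssl(base, labeled.get(lang, []), clips)
        # Step 3: generate pseudo-labels for that language's unlabeled audio.
        pseudo[lang] = [(clip, lang_model(clip)) for clip in clips]

    # Step 4: train a final model with pseudo-labels for all languages.
    # Assumption: the labeled data is kept alongside the pseudo-labels, and
    # the final model is trained from scratch rather than fine-tuned.
    combined = {
        lang: labeled.get(lang, []) + pseudo.get(lang, [])
        for lang in set(labeled) | set(pseudo)
    }
    final = train(combined)
    return final, pseudo
```

To obtain the fine-tuning variant of the last step, the caller could pass a `train` callable that starts from the step 1 weights instead of a fresh initialization.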
