CL SD AS

Jul 8, 2021

Improved Language Identification Through Cross-Lingual Self-Supervised Learning

Andros Tjandra, Diptanu Gon Choudhury, Frank Zhang, Kritika Singh, Alexis Conneau, Alexei Baevski, Assaf Sela, Yatharth Saraf, Michael Auli

arXiv:2107.04082v43.638 citations

Originality: Incremental advance

AI Analysis

This work improves language identification for downstream speech tasks, but the advance is incremental: it extends existing self-supervised methods to a multilingual setting.

The paper tackles language identification by leveraging cross-lingual self-supervised learning, showing that models pre-trained on multiple languages achieve over 89.2% accuracy with only 10 minutes of labeled data per language.

Language identification greatly impacts the success of downstream tasks such as automatic speech recognition. Recently, self-supervised speech representations learned by wav2vec 2.0 have been shown to be very effective for a range of speech tasks. We extend previous self-supervised work on language identification by experimenting with pre-trained models that were trained on real-world unconstrained speech in multiple languages, not just English. We show that models pre-trained on many languages perform better and enable language identification systems that require very little labeled data to perform well. Results on a 26-language setup show that with only 10 minutes of labeled data per language, a cross-lingually pre-trained model can achieve over 89.2% accuracy.
