CLAS · Oct 5, 2023

Evaluating Self-Supervised Speech Representations for Indigenous American Languages

arXiv:2310.03639v2 · 85 citations · h-index: 8
Originality: Synthesis-oriented
AI Analysis

This work addresses the underrepresentation of indigenous languages in speech technology, though it is incremental in that it applies existing methods to new data.

The paper evaluates self-supervised speech representations for indigenous American languages, which are often overlooked in research, and finds that state-of-the-art SSL models achieve surprisingly strong performance on low-resource ASR tasks for languages such as Quechua, Guarani, and Bribri.

The application of self-supervision to speech representation learning has garnered significant interest in recent years, due to its scalability to large amounts of unlabeled data. However, much progress, both in terms of pre-training and downstream evaluation, has remained concentrated in monolingual models that only consider English. Few models consider other languages, and even fewer consider indigenous ones. In our submission to the New Language Track of the ASRU 2023 ML-SUPERB Challenge, we present an ASR corpus for Quechua, an indigenous South American language. We benchmark the efficacy of large SSL models on Quechua, along with 6 other indigenous languages such as Guarani and Bribri, on low-resource ASR. Our results show surprisingly strong performance by state-of-the-art SSL models, demonstrating the potential generalizability of large-scale models to real-world data.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
