CL · SD · AS · Sep 18, 2025

Speech Language Models for Under-Represented Languages: Insights from Wolof

arXiv:2509.15362v2 · 2 citations · h-index: 19
Originality: Synthesis-oriented
AI Analysis

This work addresses the lack of speech processing resources for underrepresented languages like Wolof, though it is incremental in applying existing methods to a new language.

The authors develop a speech language model for Wolof. Continued pretraining of HuBERT on Wolof speech outperforms both the base model and African-centric models on automatic speech recognition, and the resulting Speech LLM also performs well on speech translation.

We present our journey in training a speech language model for Wolof, an underrepresented language spoken in West Africa, and share key insights. We first emphasize the importance of collecting large-scale, spontaneous, high-quality unsupervised speech data, and show that continued pretraining HuBERT on this dataset outperforms both the base model and African-centric models on ASR. We then integrate this speech encoder into a Wolof LLM to train the first Speech LLM for this language, extending its capabilities to tasks such as speech translation. Furthermore, we explore training the Speech LLM to perform multi-step Chain-of-Thought before transcribing or translating. Our results show that the Speech LLM not only improves speech recognition but also performs well in speech translation. The models and the code will be openly shared.
