CL, Feb 10

Where Are We At with Automatic Speech Recognition for the Bambara Language?

arXiv:2602.09785v1
h-index: 2
Originality: Synthesis-oriented
AI Analysis

This work addresses the lack of evaluation standards for ASR in underrepresented languages such as Bambara, though its contribution is incremental because the benchmark covers only a narrow formal domain.

The paper introduces the first standardized benchmark for Automatic Speech Recognition (ASR) in the Bambara language and shows that current models perform poorly: the best Word Error Rate (WER) is 46.76% and the best Character Error Rate (CER) is 13.00%, achieved by different models, which indicates that progress is insufficient for deployment.

This paper introduces the first standardized benchmark for evaluating Automatic Speech Recognition (ASR) in the Bambara language, built from one hour of professionally recorded Malian constitutional text. Designed as a controlled reference set under near-optimal acoustic and linguistic conditions, the benchmark was used to evaluate 37 models, ranging from Bambara-trained systems to large-scale commercial models. Our findings reveal that current ASR performance remains significantly below deployment standards even in this narrow formal domain: the best Word Error Rate (WER) was 46.76%, the best Character Error Rate (CER) was 13.00% and was achieved by a different model, and several prominent multilingual models exceeded 100% WER. These results suggest that multilingual pre-training and model scaling alone are insufficient for underrepresented languages. Furthermore, because this dataset represents a best-case scenario, the most simplified and formal form of spoken Bambara, these figures have yet to be tested in practical, real-world settings. We provide the benchmark and an accompanying public leaderboard to facilitate transparent evaluation and future research in Bambara speech technology.
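The abstract reports WER and CER and notes that some models exceeded 100% WER. The sketch below illustrates how these metrics are commonly defined, as edit distance divided by reference length, and why the ratio is not capped at 100%. It is a minimal illustration and not the paper's evaluation code: the function names, the whitespace tokenization, the counting of spaces as characters in CER, and the toy strings are all assumptions made here, and the benchmark's own text normalization may differ.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance: minimum substitutions, deletions, and insertions."""
    prev = list(range(len(hyp) + 1))  # distance from empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # distance from reference prefix to empty hypothesis
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference token
                curr[j - 1] + 1,     # insertion of a hypothesis token
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate; assumes plain whitespace tokenization (hypothetical choice)."""
    ref_words = reference.split()
    if not ref_words:
        raise ValueError("reference must contain at least one word")
    return edit_distance(ref_words, hypothesis.split()) / len(ref_words)


def cer(reference: str, hypothesis: str) -> float:
    """Character error rate; spaces count as characters here (conventions vary)."""
    if not reference:
        raise ValueError("reference must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)


# Toy placeholder strings, not Bambara data.
one_substitution = wer("a b c", "a x c")   # 1 error over 3 reference words
over_generation = wer("a b", "x y z w v")  # 2 substitutions + 3 insertions over 2 words
```

Because insertions count as errors while the denominator is the reference length only, a system that outputs many more words than were spoken can score above 100% WER, as `over_generation` shows. The same definitions also show how a low CER can coexist with a high WER: a single wrong character makes the whole word count as an error.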

