CL SD AS Jun 20, 2022

The Makerere Radio Speech Corpus: A Luganda Radio Corpus for Automatic Speech Recognition

arXiv:2206.09790v1586 citations, h-index: 15, Has Code
Originality: Synthesis-oriented
AI Analysis

This paper addresses the problem of building usable radio-monitoring ASR systems for under-resourced languages such as Luganda, where radio is a key communication medium. The contribution is incremental: it provides a new dataset rather than a novel method.

The researchers tackled the lack of transcribed speech datasets for under-resourced languages by releasing the Makerere Radio Speech Corpus, a 155-hour Luganda radio dataset, and reported baseline automatic speech recognition results obtained with the Coqui STT toolkit.

Building a usable radio monitoring automatic speech recognition (ASR) system is a challenging task for under-resourced languages, yet it is paramount in societies where radio is the main medium of public communication and discussion. Initial efforts by the United Nations in Uganda have shown how important it is for national planning to understand the perceptions of rural people who are excluded from social media. However, these efforts are hampered by the absence of transcribed speech datasets. In this paper, the Makerere Artificial Intelligence research lab releases a Luganda radio speech corpus of 155 hours. To our knowledge, this is the first publicly available radio dataset in sub-Saharan Africa. The paper describes the development of the voice corpus and presents baseline Luganda ASR performance results using the Coqui STT toolkit, an open-source speech recognition toolkit.

