CL · LG · SD · AS · Apr 3, 2024

Mai Ho'omāuna i ka 'Ai: Language Models Improve Automatic Speech Recognition in Hawaiian

arXiv:2404.03073v1 · 1 citation · h-index: 7
Originality: Synthesis-oriented
AI Analysis

This work addresses ASR for underrepresented languages such as Hawaiian. Its contribution is incremental: it applies a standard LM rescoring technique to a new language.

The paper improves Automatic Speech Recognition (ASR) for Hawaiian, a low-resource language, by training an external language model (LM) on ~1.5M words of Hawaiian text and using it to rescore outputs from the Whisper foundation model. Compared with Whisper alone, rescoring yields a small but significant improvement in word error rate (WER) on a manually curated test set.

In this paper we address the challenge of improving Automatic Speech Recognition (ASR) for a low-resource language, Hawaiian, by incorporating large amounts of independent text data into an ASR foundation model, Whisper. To do this, we train an external language model (LM) on ~1.5M words of Hawaiian text. We then use the LM to rescore Whisper and compute word error rates (WERs) on a manually curated test set of labeled Hawaiian data. As a baseline, we use Whisper without an external LM. Experimental results reveal a small but significant improvement in WER when ASR outputs are rescored with a Hawaiian LM. The results support leveraging all available data in the development of ASR systems for underrepresented languages.
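
The rescoring and evaluation steps described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state the exact rescoring formula, so the sketch assumes a common log-linear combination (ASR log-probability plus a weighted LM log-probability). The names `UnigramLM`, `rescore`, and `lm_weight` are hypothetical, the unigram model is a toy stand-in for a real LM, and the ASR scores in the usage example are made up for illustration.

```python
import math
from collections import Counter


def wer(reference: str, hypothesis: str) -> float:
    """Word error rate: word-level edit distance divided by reference length."""
    ref, hyp = reference.split(), hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")
    # prev[j] is the edit distance between the processed prefix of ref and hyp[:j].
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,             # deletion
                curr[j - 1] + 1,         # insertion
                prev[j - 1] + (r != h),  # substitution (cost 0 on a match)
            ))
        prev = curr
    return prev[-1] / len(ref)


class UnigramLM:
    """Toy add-one-smoothed unigram LM (an assumption, not the paper's LM)."""

    def __init__(self, corpus):
        self.counts = Counter(w for line in corpus for w in line.split())
        self.total = sum(self.counts.values())
        self.vocab = len(self.counts) + 1  # one extra slot for unseen words

    def log_prob(self, sentence: str) -> float:
        return sum(
            math.log((self.counts[w] + 1) / (self.total + self.vocab))
            for w in sentence.split()
        )


def rescore(nbest, lm, lm_weight=0.5):
    """Pick the hypothesis maximizing asr_logp + lm_weight * lm_logp.

    `nbest` is a list of (text, asr_log_prob) pairs; `lm_weight` is a
    hypothetical tuning parameter, not a value taken from the paper.
    """
    return max(nbest, key=lambda h: h[1] + lm_weight * lm.log_prob(h[0]))[0]


# Usage: the "corpus" is just the phrase from the paper's title, and the
# ASR log-probabilities below are invented for illustration.
reference = "mai ho'omāuna i ka 'ai"
lm = UnigramLM([reference])
nbest = [
    ("mai ho'omauna i ka ai", -1.0),   # ASR's top choice, diacritics dropped
    ("mai ho'omāuna i ka 'ai", -1.2),  # slightly lower ASR score
]
baseline = rescore(nbest, lm, lm_weight=0.0)  # no external LM
rescored = rescore(nbest, lm, lm_weight=0.5)
print(wer(reference, baseline), wer(reference, rescored))
```

With `lm_weight=0.0` the function reduces to the no-LM baseline, which mirrors the comparison the abstract describes. In practice the weight would be tuned on held-out data rather than fixed.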
