CL · SD · AS · Aug 10, 2023

A Novel Self-training Approach for Low-resource Speech Recognition

arXiv:2308.05269v1 · 13 citations · h-index: 11
Originality: Incremental advance
AI Analysis

This work addresses the scarcity of annotated data for low-resource languages like Punjabi, enabling more accurate ASR systems for millions of speakers, but it is incremental as it adapts existing self-training methods to new languages.

The paper tackles the problem of low-resource speech recognition by proposing a self-training approach, achieving a 14.94% relative improvement in word error rate compared to a baseline across four datasets and reporting the best results on the Common Voice Punjabi dataset.
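For readers unfamiliar with the metric, the following is a minimal sketch of how word error rate (WER) and a relative improvement figure such as the 14.94% above are commonly computed. The function names are illustrative and not taken from the paper, and the paper's exact scoring setup (text normalization, tokenization) is not described here, so whitespace tokenization is an assumption.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """WER = word-level edit distance / number of reference words.

    Assumption: words are split on whitespace with no further normalization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words processed
    # so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def relative_improvement(baseline_wer: float, new_wer: float) -> float:
    """Fraction by which WER dropped relative to the baseline."""
    if baseline_wer <= 0:
        raise ValueError("baseline WER must be positive")
    return (baseline_wer - new_wer) / baseline_wer
```

Under this definition, a relative improvement of 14.94% means the new WER is about 85% of the baseline WER, which is not the same as a drop of 14.94 absolute WER points.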

In this paper, we propose a self-training approach for automatic speech recognition (ASR) for low-resource settings. While self-training approaches have been extensively developed and evaluated for high-resource languages such as English, their applications to low-resource languages like Punjabi have been limited, despite the language being spoken by millions globally. The scarcity of annotated data has hindered the development of accurate ASR systems, especially for low-resource languages (e.g., Punjabi and Māori languages). To address this issue, we propose an effective self-training approach that generates highly accurate pseudo-labels for unlabeled low-resource speech. Our experimental analysis demonstrates that our approach significantly improves word error rate, achieving a relative improvement of 14.94% compared to a baseline model across four real speech datasets. Further, our proposed approach reports the best results on the Common Voice Punjabi dataset.
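The abstract does not say how pseudo-labels are selected, so the following is only a generic sketch of one self-training step, assuming confidence thresholding (a common choice, not necessarily the authors' method). The `transcribe` callable, the `PseudoLabel` type, and the `threshold` default are all hypothetical names introduced for illustration.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Tuple


@dataclass
class PseudoLabel:
    audio_id: str
    text: str
    confidence: float


def select_pseudo_labels(
    audio_ids: Iterable[str],
    transcribe: Callable[[str], Tuple[str, float]],
    threshold: float = 0.9,
) -> List[PseudoLabel]:
    """Keep only transcripts the current model is confident about.

    `transcribe` is a hypothetical stand-in for a trained ASR model; it is
    assumed to return (text, confidence) with confidence in [0, 1].
    """
    selected = []
    for audio_id in audio_ids:
        text, confidence = transcribe(audio_id)
        # Skip empty transcripts and low-confidence outputs.
        if text and confidence >= threshold:
            selected.append(PseudoLabel(audio_id, text, confidence))
    return selected
```

In a typical self-training loop, the selected pseudo-labels would be merged with the labeled data, the model retrained, and the step repeated on the remaining unlabeled audio.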

