CL · Sep 8, 2022

Multilingual Transformer Language Model for Speech Recognition in Low-resource Languages

Microsoft
arXiv:2209.04041v1 · 3 citations · h-index: 25
AI Analysis

This work addresses data scarcity and the high cost of deploying speech recognition systems in low-resource languages, offering a practical approach for language technology applications.

The paper tackles the challenge of training Transformer language models for speech recognition in low-resource languages by grouping locales together, resulting in improved performance and reduced costs compared to traditional multilingual models.

It is challenging to train and deploy Transformer LMs for second-pass re-ranking in hybrid speech recognition for low-resource languages due to (1) data scarcity in those languages, (2) the expensive computing costs of training and refreshing 100+ monolingual models, and (3) hosting inefficiency given sparse traffic. In this study, we present a new way to group multiple low-resource locales together and optimize the performance of Multilingual Transformer LMs in ASR. Our Locale-group Multilingual Transformer LMs outperform traditional multilingual LMs while reducing maintenance costs and operating expenses. Further, for low-resource but high-traffic locales where deploying monolingual models is feasible, we show that fine-tuning our locale-group multilingual LMs produces better monolingual LM candidates than baseline monolingual LMs.
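
To make the second-pass re-ranking setting concrete, here is a minimal sketch of n-best rescoring, in which a language model score is interpolated with the first-pass score and the hypotheses are re-sorted. It is an illustration only, not the paper's implementation: the abstract does not give the scoring formula, so the log-linear combination, the `lm_weight` parameter, the `rerank` function, and the toy `toy_lm_log_prob` scorer (a stand-in for a locale-group Transformer LM) are all assumptions.

```python
from typing import Callable, List, Tuple

# (hypothesis text, first-pass log-score); the score scale is an assumption
Hypothesis = Tuple[str, float]


def rerank(
    nbest: List[Hypothesis],
    lm_log_prob: Callable[[str], float],
    lm_weight: float = 0.5,
) -> List[Hypothesis]:
    """Re-sort an n-best list by first-pass score plus a weighted LM score."""
    rescored = [
        (text, first_pass + lm_weight * lm_log_prob(text))
        for text, first_pass in nbest
    ]
    # Highest combined log-score first
    return sorted(rescored, key=lambda pair: pair[1], reverse=True)


def toy_lm_log_prob(text: str) -> float:
    """Hypothetical stand-in for a Transformer LM: a per-word penalty
    plus a bonus for a phrase the 'model' considers fluent."""
    score = -1.0 * len(text.split())
    if text == "recognize speech":
        score += 3.0
    return score


nbest = [("wreck a nice beach", -4.0), ("recognize speech", -4.5)]
reranked = rerank(nbest, toy_lm_log_prob)
print(reranked[0][0])
```

In this toy case the first pass prefers "wreck a nice beach", while the interpolated score promotes "recognize speech". In a deployed system the toy scorer would be replaced by the multilingual or fine-tuned monolingual LM serving the request's locale.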

