CL · LG · Mar 25, 2021

Real-time low-resource phoneme recognition on edge devices

arXiv:2103.13997v1 · 2 citations
Originality: Incremental advance
AI Analysis

The paper addresses the shortage of speech recognition models for non-English languages and makes speech recognition practical for edge-device applications such as mobile phones and car displays.

It tackles speech recognition for languages that lack large datasets by presenting a method for building models that are highly accurate yet need little storage, memory, and training data, so they can be deployed on edge devices for real-time use.

While speech recognition has seen a surge in interest and research over the last decade, most machine learning models for speech recognition either require large training datasets or lots of storage and memory. Combined with the prominence of English as the number one language in which audio data is available, this means most other languages currently lack good speech recognition models. The method presented in this paper shows how to create and train models for speech recognition in any language which are not only highly accurate, but also require very little storage, memory and training data when compared with traditional models. This allows training models to recognize any language and deploying them on edge devices such as mobile phones or car displays for fast real-time speech recognition.

Code Implementations: 1 repo