CL | AI | Jan 17, 2025

Automatic Speech Recognition for Sanskrit with Transfer Learning

arXiv:2501.10024v1 | 1 citations | h-index: 1 | C3IT
Originality: Synthesis-oriented
AI Analysis

This work addresses the challenge of improving accessibility and technological support for Sanskrit learning, but it is incremental, since it applies an existing method to a new language.

The paper tackles the problem of limited digital content and intricate linguistics for Sanskrit by developing an automatic speech recognition model using transfer learning on OpenAI's Whisper, achieving a word error rate of 15.42% on the Vaksancayah dataset.

Sanskrit, one of humanity's most ancient languages, has a vast collection of books and manuscripts on diverse topics, accumulated over millennia. However, its digital content (audio and text), which is vital for training AI systems, is profoundly limited. Furthermore, its intricate linguistics make it hard to develop robust NLP tools for wider accessibility. Given these constraints, we have developed an automatic speech recognition model for Sanskrit by applying transfer learning to OpenAI's Whisper model. After carefully optimising the hyper-parameters, we obtained promising results, with our transfer-learned model achieving a word error rate of 15.42% on the Vaksancayah dataset. An online demo of our model is available to the public for evaluating its performance firsthand, paving the way for improved accessibility and technological support for Sanskrit learning in the modern era.
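The headline metric above, word error rate (WER), is conventionally the word-level edit distance between a reference transcript and the model's hypothesis, divided by the number of reference words. The sketch below shows that standard definition only; it is not the authors' evaluation code. The function name `word_error_rate`, the whitespace tokenisation, and the toy strings are assumptions made for illustration, and the paper's own text normalisation may differ.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Assumption: words are separated by whitespace. The paper's actual
    tokenisation and normalisation are not stated in this summary.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr
    return prev[-1] / len(ref)


# Toy example (hypothetical strings, not taken from the dataset):
# one substitution and one deletion against a four-word reference.
toy_wer = word_error_rate("a b c d", "a x c")
```

Under this definition, a WER of 15.42% corresponds to roughly 15 word-level errors (substitutions, deletions, or insertions) per 100 reference words.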
