AS | AI | LG | Jun 2, 2025

Dhvani: A Weakly-supervised Phonemic Error Detection and Personalized Feedback System for Hindi

arXiv:2506.02166v1 | 1 citation | h-index: 2 | INTERSPEECH
Originality: Incremental advance
AI Analysis

This work addresses a critical gap in pronunciation tools for Hindi learners, but the advance is incremental: it adapts existing CAPT concepts to a new language.

The paper tackles the lack of Computer-Assisted Pronunciation Training (CAPT) systems for Hindi, a major Indian language with over 600 million speakers. It proposes Dhvani, a system that detects phonemic errors and gives learners personalized feedback. Its two supporting contributions are synthetic generation of mispronounced Hindi speech and a novel feedback methodology.

Computer-Assisted Pronunciation Training (CAPT) has been extensively studied for English. However, there remains a critical gap in its application to Indian languages with a base of 1.5 billion speakers. Pronunciation tools tailored to Indian languages are strikingly lacking despite the fact that millions learn them every year. With over 600 million speakers and being the fourth most-spoken language worldwide, improving Hindi pronunciation is a vital first step toward addressing this gap. This paper proposes 1) Dhvani -- a novel CAPT system for Hindi, 2) synthetic speech generation for Hindi mispronunciations, and 3) a novel methodology for providing personalized feedback to learners. While the system often interacts with learners using Devanagari graphemes, its core analysis targets phonemic distinctions, leveraging Hindi's highly phonetic orthography to analyze mispronounced speech and provide targeted feedback.
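To make the idea of phoneme-level error detection concrete, here is a minimal sketch. It is not the method used in Dhvani, which the abstract does not detail. It assumes a canonical phoneme sequence for the target word and an observed sequence from some phoneme recognizer, both hypothetical inputs, and aligns them with Python's standard `difflib.SequenceMatcher` to label substitutions, deletions, and insertions. The function name `phoneme_errors` and the IPA transcriptions are illustrative only.

```python
from difflib import SequenceMatcher

# Map difflib opcode tags to the usual pronunciation-error labels.
ERROR_KINDS = {
    "replace": "substitution",
    "delete": "deletion",
    "insert": "insertion",
}


def phoneme_errors(expected, observed):
    """Align two phoneme lists and return the mismatches.

    Each result is a tuple (kind, expected_segment, observed_segment).
    Both arguments are lists of phoneme symbols (assumed inputs; a real
    system would obtain `observed` from a phoneme recognizer).
    """
    matcher = SequenceMatcher(a=expected, b=observed, autojunk=False)
    errors = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue  # phonemes matched, nothing to report
        errors.append((ERROR_KINDS[tag], expected[i1:i2], observed[j1:j2]))
    return errors


if __name__ == "__main__":
    # Illustrative case: the aspirated stop in /kʰaːnaː/ is produced
    # without aspiration, giving /kaːnaː/.
    target = ["kʰ", "aː", "n", "aː"]
    spoken = ["k", "aː", "n", "aː"]
    for kind, exp, obs in phoneme_errors(target, spoken):
        print(f"{kind}: expected {exp}, heard {obs}")
```

Because Hindi orthography is highly phonetic, each detected error can in principle be reported to the learner in terms of the corresponding Devanagari grapheme. That mapping is left out here because it depends on details the abstract does not give.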

