Multilingual Dysarthric Speech Assessment Using Universal Phone Recognition and Language-Specific Phonemic Contrast Modeling
This work addresses the need for multilingual automated assessment of dysarthric speech, which matters for clinical applications in neurological disorders. The contribution is incremental: it builds on existing phoneme-based methods by adding language-specific adaptations.
The paper tackles automated intelligibility assessment for dysarthric speech across multiple languages. It develops a framework that integrates universal phone recognition with language-specific phonemic contrast modeling and yields three metrics: phoneme error rate (PER), phonological feature error rate (PFER), and phoneme coverage (PhonCov). In analyses on English, Spanish, Italian, and Tamil, these metrics benefit from the mapping and alignment techniques.
The growing prevalence of neurological disorders associated with dysarthria motivates the need for automated intelligibility assessment methods that are applicable across languages. However, most existing approaches are either limited to a single language or fail to capture language-specific factors shaping intelligibility. We present a multilingual phoneme-production assessment framework that integrates universal phone recognition with language-specific phoneme interpretation, using contrastive phonological feature distances for phone-to-phoneme mapping and sequence alignment. The framework yields three metrics: phoneme error rate (PER), phonological feature error rate (PFER), and a newly proposed alignment-free measure, phoneme coverage (PhonCov). Analyses on English, Spanish, Italian, and Tamil show that PER benefits from the combination of mapping and alignment, PFER from alignment alone, and PhonCov from mapping. Further analyses demonstrate that the proposed framework captures clinically meaningful patterns of intelligibility degradation consistent with established observations of dysarthric speech.
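To make the difference between an alignment-based and an alignment-free metric concrete, the following minimal Python sketch computes PER and a coverage score on toy phoneme sequences. It is an illustration under stated assumptions, not the framework's implementation: PER here uses the standard unit-cost Levenshtein distance rather than the feature-distance alignment described above, and the abstract does not give a formula for PhonCov, so `phoneme_coverage` assumes it is the fraction of distinct reference phonemes that appear anywhere in the recognized output. The function names are hypothetical. PFER is omitted because it requires a phonological feature table that is not specified here.

```python
from typing import Sequence


def edit_distance(ref: Sequence[str], hyp: Sequence[str]) -> int:
    """Levenshtein distance with unit costs (assumption: no feature weighting)."""
    prev = list(range(len(hyp) + 1))  # distances from an empty reference prefix
    for i, r in enumerate(ref, start=1):
        curr = [i]  # deleting the first i reference phonemes
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(
                prev[j] + 1,         # deletion of a reference phoneme
                curr[j - 1] + 1,     # insertion of a hypothesis phoneme
                prev[j - 1] + cost,  # match or substitution
            ))
        prev = curr
    return prev[-1]


def phoneme_error_rate(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """PER = (substitutions + deletions + insertions) / reference length."""
    if not ref:
        raise ValueError("reference sequence must not be empty")
    return edit_distance(ref, hyp) / len(ref)


def phoneme_coverage(ref: Sequence[str], hyp: Sequence[str]) -> float:
    """ASSUMED definition: share of distinct reference phonemes found in hyp.

    No alignment is used; only set membership matters.
    """
    target = set(ref)
    if not target:
        raise ValueError("reference sequence must not be empty")
    return len(target & set(hyp)) / len(target)


if __name__ == "__main__":
    reference = ["p", "a", "t", "a"]   # toy target phonemes
    recognized = ["b", "a", "t"]       # toy mapped recognizer output
    print("PER:", phoneme_error_rate(reference, recognized))
    print("Coverage:", phoneme_coverage(reference, recognized))
```

Because the coverage score ignores order, it is unaffected by alignment choices, which is consistent with the reported finding that PhonCov benefits from mapping rather than from alignment.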