SD AI AS · Sep 17, 2025

Deploying UDM Series in Real-Life Stuttered Speech Applications: A Clinical Evaluation Framework

Eric Zhang, Li Wei, Sarah Chen, Michael Wang

arXiv:2509.14304v17.01 citations

Originality: Incremental advance

AI Analysis

This work addresses the problem of clinical adoption for AI in speech therapy by providing an interpretable system with high accuracy, though it appears incremental as it builds on an existing framework.

The paper tackles the trade-off between accuracy and interpretability in stuttered speech detection by evaluating the Unconstrained Dysfluency Modeling (UDM) series, reporting state-of-the-art performance with an F1 score of 0.89±0.04 and a 34% reduction in diagnostic time.

Stuttered and dysfluent speech detection systems have traditionally suffered from a trade-off between accuracy and clinical interpretability. While end-to-end deep learning models achieve high performance, their black-box nature limits clinical adoption. This paper evaluates the Unconstrained Dysfluency Modeling (UDM) series, the current state-of-the-art framework developed at Berkeley, which combines a modular architecture, explicit phoneme alignment, and interpretable outputs for real-world clinical deployment. Through extensive experiments involving patients and certified speech-language pathologists (SLPs), we demonstrate that UDM achieves state-of-the-art performance (F1: 0.89±0.04) while providing clinically meaningful interpretability scores (4.2/5.0). Our deployment study shows an 87% clinician acceptance rate and a 34% reduction in diagnostic time. The results provide strong evidence that UDM represents a practical pathway toward AI-assisted speech therapy in clinical environments.
