AS AI CL SD | Jun 5, 2025

Seamless Dysfluent Speech Text Alignment for Disordered Speech Analysis

Zongli Ye, Jiachen Lian, Xuanru Zhou, Jinming Zhang, Haodong Li, Shuhe Li, Chenxu Guo, Anaisha Das, Peter Park, Zoe Ezzes, Jet Vonk, Brittany Morin

arXiv:2506.12073v1 | 6.69 citations | h-index: 97 | INTERSPEECH

Originality: Incremental advance

AI Analysis

This work addresses a domain-specific problem for automating diagnosis of speech disorders, with incremental improvements in modeling phoneme similarities.

The paper addresses the problem of aligning dysfluent speech with its intended text, a step toward diagnosing neurodegenerative speech disorders. The proposed method, Neural LCS, significantly outperformed state-of-the-art models in alignment accuracy and dysfluent speech segmentation on both simulated and real PPA data.

Accurate alignment of dysfluent speech with intended text is crucial for automating the diagnosis of neurodegenerative speech disorders. Traditional methods often fail to model phoneme similarities effectively, limiting their performance. In this work, we propose Neural LCS, a novel approach for dysfluent text-text and speech-text alignment. Neural LCS addresses key challenges, including partial alignment and context-aware similarity mapping, by leveraging robust phoneme-level modeling. We evaluate our method on a large-scale simulated dataset, generated using advanced data simulation techniques, and real PPA data. Neural LCS significantly outperforms state-of-the-art models in both alignment accuracy and dysfluent speech segmentation. Our results demonstrate the potential of Neural LCS to enhance automated systems for diagnosing and analyzing speech disorders, offering a more accurate and linguistically grounded solution for dysfluent speech alignment.
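To make the alignment problem concrete, here is a minimal sketch of classical exact-match alignment, assuming "LCS" refers to longest common subsequence. It is not the paper's Neural LCS. The function name `lcs_align` and the phoneme sequences are hypothetical and chosen only for illustration. The second example shows the limitation the abstract points to: exact matching treats the similar phonemes B and P as unrelated.

```python
from typing import List, Tuple


def lcs_align(spoken: List[str], reference: List[str]) -> List[Tuple[int, int]]:
    """Return (i, j) pairs where spoken[i] is aligned to reference[j].

    Classical exact-match longest common subsequence; spoken phonemes
    that receive no pair are left unaligned (candidate dysfluencies).
    """
    n, m = len(spoken), len(reference)
    # dp[i][j] = LCS length of the suffixes spoken[i:] and reference[j:]
    dp = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n - 1, -1, -1):
        for j in range(m - 1, -1, -1):
            if spoken[i] == reference[j]:
                dp[i][j] = dp[i + 1][j + 1] + 1
            else:
                dp[i][j] = max(dp[i + 1][j], dp[i][j + 1])

    # Walk forward through the table to recover one optimal alignment.
    pairs: List[Tuple[int, int]] = []
    i = j = 0
    while i < n and j < m:
        if spoken[i] == reference[j]:
            pairs.append((i, j))
            i += 1
            j += 1
        elif dp[i + 1][j] >= dp[i][j + 1]:
            i += 1  # skip a spoken phoneme
        else:
            j += 1  # skip a reference phoneme
    return pairs


# Hypothetical repetition: "pl-please" against the intended "please".
repeated = lcs_align(["P", "L", "P", "L", "IY", "Z"], ["P", "L", "IY", "Z"])
# → [(0, 0), (1, 1), (4, 2), (5, 3)]; spoken indices 2 and 3 stay unaligned

# Hypothetical substitution: B spoken in place of P.
substituted = lcs_align(["B", "L", "IY", "Z"], ["P", "L", "IY", "Z"])
# → [(1, 1), (2, 2), (3, 3)]; B is dropped even though it is close to P
```

In the repetition case the unaligned spoken phonemes mark the dysfluent span. In the substitution case exact matching discards the near-miss phoneme entirely, which is the kind of gap a similarity-aware alignment is meant to close.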
