SD AI AS

Aug 4, 2025

StutterCut: Uncertainty-Guided Normalised Cut for Dysfluency Segmentation

Suhita Ghosh, Melanie Jouaiti, Jan-Ole Perschewski, Sebastian Stober

arXiv:2508.02255v1

9.32 citations

h-index: 5

INTERSPEECH

Originality: Incremental advance

AI Analysis

The work addresses the need for precise dysfluency segmentation in speech therapy and real-time feedback. It is an incremental improvement over methods that classify dysfluencies only at the utterance level.

The paper tackles dysfluency segmentation in speech by introducing StutterCut, a semi-supervised framework that formulates the task as a graph partitioning problem and achieves higher F1 scores and more precise onset detection than existing methods.

Detecting and segmenting dysfluencies is crucial for effective speech therapy and real-time feedback. However, most methods only classify dysfluencies at the utterance level. We introduce StutterCut, a semi-supervised framework that formulates dysfluency segmentation as a graph partitioning problem, where speech embeddings from overlapping windows are represented as graph nodes. We refine the connections between nodes using a pseudo-oracle classifier trained on weak (utterance-level) labels, with its influence controlled by an uncertainty measure from Monte Carlo dropout. Additionally, we extend the weakly labelled FluencyBank dataset by incorporating frame-level dysfluency boundaries for four dysfluency types. This provides a more realistic benchmark compared to synthetic datasets. Experiments on real and synthetic datasets show that StutterCut outperforms existing methods, achieving higher F1 scores and more precise stuttering onset detection.
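The pipeline described in the abstract (window embeddings as graph nodes, edges refined by a classifier whose influence depends on Monte Carlo dropout uncertainty, then a normalised cut) can be illustrated with a minimal sketch. This is not the authors' implementation: the cosine affinity, the entropy-based uncertainty, the blending rule in `refined_affinity`, and all function names are assumptions chosen for illustration, and the cut uses the standard spectral relaxation of the two-way normalised cut.

```python
import numpy as np


def mc_dropout_stats(prob_samples):
    """Summarise T stochastic forward passes of a classifier, shape (T, N).

    Returns the mean dysfluency probability per window and an uncertainty
    in [0, 1] (binary predictive entropy divided by log 2). Assumed measure.
    """
    mean = prob_samples.mean(axis=0)
    eps = 1e-12
    entropy = -(mean * np.log(mean + eps) + (1 - mean) * np.log(1 - mean + eps))
    return mean, np.clip(entropy / np.log(2), 0.0, 1.0)


def refined_affinity(embeddings, mean_prob, uncertainty):
    """Blend embedding similarity with classifier agreement (hypothetical rule)."""
    norms = np.linalg.norm(embeddings, axis=1, keepdims=True)
    unit = embeddings / np.maximum(norms, 1e-12)
    sim = (unit @ unit.T + 1.0) / 2.0  # cosine similarity mapped to [0, 1]
    # Agreement is 1 when two windows receive the same mean probability.
    agree = 1.0 - np.abs(mean_prob[:, None] - mean_prob[None, :])
    # The classifier is trusted only where both windows are confident.
    trust = (1.0 - uncertainty)[:, None] * (1.0 - uncertainty)[None, :]
    W = (1.0 - trust) * sim + trust * agree
    np.fill_diagonal(W, 0.0)  # no self-loops
    return W


def normalised_cut(W):
    """Two-way normalised cut via the second eigenvector of the
    symmetric normalised Laplacian (spectral relaxation)."""
    d = np.maximum(W.sum(axis=1), 1e-12)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)  # eigenvalues in ascending order
    fiedler = d_inv_sqrt * vecs[:, 1]
    return fiedler > 0


def segment(embeddings, prob_samples):
    """Return a boolean mask over windows, True meaning 'dysfluent'."""
    mean_prob, uncertainty = mc_dropout_stats(prob_samples)
    mask = normalised_cut(refined_affinity(embeddings, mean_prob, uncertainty))
    # The eigenvector sign is arbitrary, so name the partition with the
    # higher mean classifier probability as the dysfluent one.
    if mask.any() and not mask.all():
        if mean_prob[mask].mean() < mean_prob[~mask].mean():
            mask = ~mask
    return mask
```

Calling `segment(embeddings, prob_samples)` with an `(N, D)` array of window embeddings and a `(T, N)` array of dropout-sampled probabilities yields one label per window. Because the windows overlap and are ordered in time, runs of `True` can then be mapped back to approximate onset and offset times using the window hop size.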
