SD AI AS | Aug 4, 2025

StutterCut: Uncertainty-Guided Normalised Cut for Dysfluency Segmentation

arXiv:2508.02255v1 | 1 citation | h-index: 5 | INTERSPEECH
Originality: Incremental advance
AI Analysis

StutterCut addresses the need for precise dysfluency segmentation in speech therapy and real-time feedback. It is an incremental improvement over utterance-level classification methods.

The paper tackles dysfluency segmentation in speech by introducing StutterCut, a semi-supervised framework that formulates the task as a graph partitioning problem. It reports higher F1 scores and more precise onset detection than existing methods.

Detecting and segmenting dysfluencies is crucial for effective speech therapy and real-time feedback. However, most methods only classify dysfluencies at the utterance level. We introduce StutterCut, a semi-supervised framework that formulates dysfluency segmentation as a graph partitioning problem, where speech embeddings from overlapping windows are represented as graph nodes. We refine the connections between nodes using a pseudo-oracle classifier trained on weak (utterance-level) labels, with its influence controlled by an uncertainty measure from Monte Carlo dropout. Additionally, we extend the weakly labelled FluencyBank dataset by incorporating frame-level dysfluency boundaries for four dysfluency types. This provides a more realistic benchmark compared to synthetic datasets. Experiments on real and synthetic datasets show that StutterCut outperforms existing methods, achieving higher F1 scores and more precise stuttering onset detection.
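The pipeline described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so everything concrete here is an assumption rather than the authors' method: cosine similarity as the edge weight, normalised binary entropy of the Monte Carlo dropout mean as the uncertainty measure, a confidence-weighted blend as the refinement rule, and a two-way spectral relaxation of the normalised cut. The function names (`cosine_affinity`, `mc_dropout_summary`, `refine_affinity`, `normalised_cut`) are hypothetical.

```python
import numpy as np

def cosine_affinity(embeddings):
    """Edge weights: cosine similarity between window embeddings, clipped to [0, 1]."""
    unit = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
    return np.clip(unit @ unit.T, 0.0, 1.0)

def mc_dropout_summary(samples):
    """Summarise T stochastic forward passes of a (hypothetical) pseudo-oracle.

    samples: array of shape (T, N) with per-window dysfluency probabilities.
    Returns the mean probability and its normalised binary entropy in [0, 1].
    """
    mean = np.clip(samples.mean(axis=0), 1e-8, 1 - 1e-8)
    entropy = -(mean * np.log(mean) + (1 - mean) * np.log(1 - mean))
    return mean, entropy / np.log(2)

def refine_affinity(W, probs, uncertainty):
    """Blend W with classifier agreement (assumed rule, not taken from the paper).

    Pairs of windows where the classifier is confident on both ends lean on
    its agreement score; uncertain pairs keep their embedding similarity.
    """
    agree = 1.0 - np.abs(probs[:, None] - probs[None, :])
    conf = 1.0 - uncertainty
    alpha = conf[:, None] * conf[None, :]
    return (1.0 - alpha) * W + alpha * agree

def normalised_cut(W):
    """Two-way normalised cut via the second eigenvector of the symmetric Laplacian."""
    d = W.sum(axis=1)
    d_inv_sqrt = 1.0 / np.sqrt(d)
    L_sym = np.eye(len(W)) - d_inv_sqrt[:, None] * W * d_inv_sqrt[None, :]
    _, vecs = np.linalg.eigh(L_sym)      # eigenvalues in ascending order
    fiedler = d_inv_sqrt * vecs[:, 1]    # map back to the generalised eigenvector
    return fiedler > 0                   # which side is "dysfluent" is arbitrary

# Toy data: 6 windows near one direction, 4 near another (purely synthetic).
rng = np.random.default_rng(0)
centres = np.zeros((2, 8))
centres[0, :2] = [1.0, 0.2]
centres[1, :2] = [0.2, 1.0]
emb = np.repeat(centres, [6, 4], axis=0) + rng.normal(scale=0.01, size=(10, 8))
base = np.array([0.1] * 6 + [0.9] * 4)
samples = np.clip(base + rng.normal(scale=0.02, size=(20, 10)), 0.0, 1.0)

probs, unc = mc_dropout_summary(samples)
labels = normalised_cut(refine_affinity(cosine_affinity(emb), probs, unc))
print(labels.astype(int))
```

The sign of an eigenvector is arbitrary, so the sketch only separates the windows into two groups. Deciding which group is the dysfluent one would need an extra step, for example comparing the mean pseudo-oracle probability of each group.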

