AI · Aug 25, 2025

A Comparative Study of Controllability, Explainability, and Performance in Dysfluency Detection Models

Eric Zhang, Li Wei, Sarah Chen, Michael Wang

arXiv:2509.00058v13.3

Originality: Synthesis-oriented

AI Analysis

This study addresses the need for clinically viable dysfluency detection models by highlighting trade-offs in controllability and explainability. It is incremental, however, since it compares existing approaches without introducing new methods.

The paper presents a systematic comparative analysis of four dysfluency detection models (YOLO-Stutter, FluentNet, UDM, and SSDM), evaluating their performance, controllability, and explainability. It finds that UDM achieves the best balance of accuracy and clinical interpretability.

Recent advances in dysfluency detection have introduced a variety of modeling paradigms, ranging from lightweight object-detection-inspired networks (YOLO-Stutter) to modular interpretable frameworks (UDM). While performance on benchmark datasets continues to improve, clinical adoption requires more than accuracy: models must be controllable and explainable. In this paper, we present a systematic comparative analysis of four representative approaches (YOLO-Stutter, FluentNet, UDM, and SSDM) along three dimensions: performance, controllability, and explainability. Through comprehensive evaluation on multiple datasets and expert clinician assessment, we find that YOLO-Stutter and FluentNet provide efficiency and simplicity, but with limited transparency; UDM achieves the best balance of accuracy and clinical interpretability; and SSDM, while promising, could not be fully reproduced in our experiments. Our analysis highlights the trade-offs among competing approaches and identifies future directions for clinically viable dysfluency modeling. We also provide detailed implementation insights and practical deployment considerations for each approach.
