Automated Measurement of Geniohyoid Muscle Thickness During Speech Using Deep Learning and Ultrasound
This work provides a tool for scalable and objective assessment of speech and swallowing disorders, benefiting researchers and clinicians studying speech motor control by automating a previously time-consuming manual measurement process.
This paper introduces SMMA, an automated framework that uses deep learning and ultrasound to measure geniohyoid muscle thickness during speech, achieving near-human accuracy (Dice = 0.9037, MAE = 0.53 mm). Applying SMMA to Cantonese vowel production, the authors found that geniohyoid thickness was significantly greater for /a:/ (7.29 mm) than for /i:/ (5.95 mm, p < 0.001).
Manual measurement of muscle morphology from ultrasound during speech is time-consuming and limits large-scale studies. We present SMMA, a fully automated framework that combines deep-learning segmentation with skeleton-based thickness quantification to analyze geniohyoid (GH) muscle dynamics. Validation demonstrates near-human-level accuracy (Dice = 0.9037, MAE = 0.53 mm, r = 0.901). Application to Cantonese vowel production (N = 11) reveals systematic patterns: /a:/ shows significantly greater GH thickness (7.29 mm) than /i:/ (5.95 mm, p < 0.001, Cohen's d > 1.3), suggesting greater GH activation during /a:/ than during /i:/, consistent with the muscle's role in mandibular depression. Sex differences (GH thickness 5-8% greater in males) reflect anatomical scaling. SMMA achieves expert-validated accuracy while eliminating the need for manual annotation, enabling scalable investigations of speech motor control and objective assessment of speech and swallowing disorders.
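The validation and effect-size figures quoted above (Dice, MAE, Cohen's d) follow standard definitions, which the minimal sketch below illustrates. It is not the SMMA implementation: the function names are hypothetical, masks are assumed to be binary nested lists, thickness values are assumed to be in millimetres, and Cohen's d is computed with a pooled standard deviation, which may differ from the variant used in the paper.

```python
from math import sqrt
from statistics import mean, variance


def dice_coefficient(pred, truth):
    """Dice = 2|A and B| / (|A| + |B|) for two binary masks (nested lists of 0/1)."""
    flat_pred = [p for row in pred for p in row]
    flat_truth = [t for row in truth for t in row]
    if len(flat_pred) != len(flat_truth):
        raise ValueError("masks must have the same shape")
    intersection = sum(1 for p, t in zip(flat_pred, flat_truth) if p and t)
    total = sum(1 for p in flat_pred if p) + sum(1 for t in flat_truth if t)
    # Assumed convention: two empty masks count as perfect agreement.
    return 1.0 if total == 0 else 2.0 * intersection / total


def mean_absolute_error(auto_mm, manual_mm):
    """Mean absolute difference between automated and manual thickness (mm)."""
    if len(auto_mm) != len(manual_mm) or not auto_mm:
        raise ValueError("inputs must be non-empty and of equal length")
    return mean(abs(a - m) for a, m in zip(auto_mm, manual_mm))


def cohens_d(x, y):
    """Cohen's d with a pooled sample standard deviation (assumed variant)."""
    nx, ny = len(x), len(y)
    pooled_var = ((nx - 1) * variance(x) + (ny - 1) * variance(y)) / (nx + ny - 2)
    return (mean(x) - mean(y)) / sqrt(pooled_var)
```

With per-speaker mean thickness for each vowel, `cohens_d(thickness_a, thickness_i)` would give an effect size comparable in spirit to the reported d > 1.3, although a paired design may call for a different formulation.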