Automated Measurement of Geniohyoid Muscle Thickness During Speech Using Deep Learning and Ultrasound

Alisher Myrgyyassov, Bruce Xiao Wang, Yu Sun, Shuming Huang, Zhen Song, Min Ney Wong, Yongping Zheng

arXiv:2603.03350v1 | 1.2 | h-index: 9

Originality: Incremental advance

AI Analysis

This work provides a tool for scalable and objective assessment of speech and swallowing disorders, benefiting researchers and clinicians studying speech motor control by automating a previously time-consuming manual measurement process.

This paper introduces SMMA, an automated framework that uses deep learning and ultrasound to measure geniohyoid muscle thickness during speech, achieving near-human accuracy (Dice = 0.9037, MAE = 0.53 mm). Applying SMMA to Cantonese vowel production, the authors found that /a:/ exhibited significantly greater geniohyoid thickness (7.29 mm) than /i:/ (5.95 mm, p < 0.001).
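For readers unfamiliar with the two accuracy figures quoted above, the following is a minimal sketch of how a Dice coefficient and a mean absolute error (MAE) are typically computed. It is not the authors' evaluation code; the function names and the toy arrays are hypothetical and chosen only for illustration.

```python
import numpy as np


def dice_coefficient(pred_mask, true_mask):
    """Dice overlap between two binary masks: 2|A and B| / (|A| + |B|)."""
    pred = np.asarray(pred_mask, dtype=bool)
    true = np.asarray(true_mask, dtype=bool)
    denom = pred.sum() + true.sum()
    if denom == 0:
        # Both masks empty: treat as perfect agreement (a common convention).
        return 1.0
    return float(2.0 * np.logical_and(pred, true).sum() / denom)


def mean_absolute_error(auto_mm, manual_mm):
    """Mean absolute difference between automated and manual thicknesses (mm)."""
    auto = np.asarray(auto_mm, dtype=float)
    manual = np.asarray(manual_mm, dtype=float)
    return float(np.mean(np.abs(auto - manual)))


# Toy data (hypothetical, not taken from the paper).
predicted = np.array([[1, 1], [0, 0]])
reference = np.array([[1, 0], [0, 0]])
dice = dice_coefficient(predicted, reference)
mae = mean_absolute_error([7.0, 6.0], [7.5, 5.5])
```

A Dice value of 1 means the predicted and reference masks coincide exactly, while the MAE is expressed in the same units as the thickness measurements.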

Manual measurement of muscle morphology from ultrasound during speech is time-consuming and limits large-scale studies. We present SMMA, a fully automated framework that combines deep-learning segmentation with skeleton-based thickness quantification to analyze geniohyoid (GH) muscle dynamics. Validation demonstrates near-human-level accuracy (Dice = 0.9037, MAE = 0.53 mm, r = 0.901). Application to Cantonese vowel production (N = 11) reveals systematic patterns: /a:/ shows significantly greater GH thickness (7.29 mm) than /i:/ (5.95 mm, p < 0.001, Cohen's d > 1.3), suggesting greater GH activation during production of /a:/ than /i:/, consistent with its role in mandibular depression. Sex differences (5-8% greater in males) reflect anatomical scaling. SMMA achieves expert-validated accuracy while eliminating the need for manual annotation, enabling scalable investigations of speech motor control and objective assessment of speech and swallowing disorders.
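The abstract does not spell out how the skeleton-based thickness quantification works, so the following is only a sketch of one common approach, under stated assumptions: the segmentation mask is skeletonized, a Euclidean distance transform gives each skeleton pixel its distance to the muscle boundary, and twice that distance approximates the local thickness. The function name, the median aggregation, and the 0.1 mm pixel spacing are all hypothetical and not taken from the paper.

```python
import numpy as np
from scipy.ndimage import distance_transform_edt
from skimage.morphology import skeletonize


def skeleton_thickness_mm(mask, pixel_spacing_mm):
    """Approximate the thickness of a binary mask along its skeleton.

    Assumption: thickness at a skeleton pixel is about twice its distance
    to the nearest background pixel; the median over the skeleton is
    reported. This is an illustrative choice, not the authors' procedure.
    """
    mask = np.asarray(mask, dtype=bool)
    if not mask.any():
        return float("nan")
    skeleton = skeletonize(mask)            # one-pixel-wide centerline
    dist = distance_transform_edt(mask)     # distance to nearest background
    local_thickness_px = 2.0 * dist[skeleton]
    return float(np.median(local_thickness_px) * pixel_spacing_mm)


# Synthetic "muscle": a 10-pixel-thick horizontal band (hypothetical data).
mask = np.zeros((40, 100), dtype=bool)
mask[15:25, 20:80] = True
thickness = skeleton_thickness_mm(mask, pixel_spacing_mm=0.1)
```

Using the median rather than the mean is a design choice that reduces the influence of skeleton end points, where the distance to the boundary is less representative of the muscle belly. Because distances are measured between pixel centers, the estimate can be off by roughly one pixel.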
