CLMay 15

Can Large Language Models Imitate Human Speech for Clinical Assessment? LLM-Driven Data Augmentation for Cognitive Score Prediction

Si-Belkacem Yamine Ketir, Lenard Paulo Tamayo, Shohei Hisada, Shaowen Peng, Shoko Wakamiya, Eiji Aramaki

arXiv:2605.1607779.3

Predicted impact top 72% in CL · last 90 daysOriginality Incremental advance

AI Analysis

This work addresses class imbalance and data scarcity in clinical speech analysis for cognitive assessment, offering a practical augmentation method for low-resource settings.

The paper proposes an LLM-driven data augmentation framework using GPT-5 to generate synthetic oral monologues from written responses, improving cognitive score prediction from speech. The similarity-guided selection strategy reduces prediction error for minority low-score participants by up to 30% while maintaining performance for the majority group.

Accurate assessment of cognitive decline from spontaneous speech remains challenging due to limited dataset size and class imbalance. In this work, we propose a large language model (LLM)-driven data augmentation framework to improve the prediction of cognitive scores from speech. Experiments are conducted on a Japanese corpus in which each participant provides both a spontaneous oral narrative and a written response to the same clinical prompt. The written responses serve as semantic anchors to generate multiple oral-like monologues in different styles using GPT-5. We then predict Hasegawa Dementia Scale scores, a widely used cognitive screening tool in Japan, using a Partial Least Squares regression model trained on Sentence-BERT speech embeddings. We investigate two augmentation strategies: random class-balanced selection, which yields moderate but unstable improvements, and similarity-guided class-balanced selection. The latter prioritizes semantically close synthetic samples, leading to more consistent improvements and substantially reducing prediction error for minority low-score participants while maintaining performance for the majority group. Overall, our findings demonstrate the potential of semantically guided LLM-driven augmentation as a principled approach for addressing class imbalance and improving data efficiency in clinical speech analysis.

View on arXiv PDF

Similar