CL SD AS
Jun 4, 2025

A Novel Data Augmentation Approach for Automatic Speaking Assessment on Opinion Expressions

Chung-Chun Wang, Jhen-Ke Lin, Hao-Chien Lu, Hong-Yun Lin, Berlin Chen

arXiv:2506.04077v2 2.7 h-index: 1 Slate

Originality: Incremental advance

AI Analysis

This work addresses low-resource constraints in automated speaking assessment for opinion expressions, enabling more reliable scoring with cross-modal information, though it is incremental in its approach.

The paper tackles automated speaking assessment on opinion expressions, where labeled recordings are scarce. It proposes a training paradigm that uses LLMs and text-to-speech synthesis to generate diverse responses, and it reports improved performance on the LTTC dataset over methods that use real data or conventional augmentation.

Automated speaking assessment (ASA) on opinion expressions is often hampered by the scarcity of labeled recordings, which restricts prompt diversity and undermines scoring reliability. To address this challenge, we propose a novel training paradigm that leverages a large language model (LLM) to generate diverse responses of a given proficiency level, converts responses into synthesized speech via speaker-aware text-to-speech synthesis, and employs a dynamic importance loss to adaptively reweight training instances based on feature distribution differences between synthesized and real speech. Subsequently, a multimodal large language model integrates aligned textual features with speech signals to predict proficiency scores directly. Experiments conducted on the LTTC dataset show that our approach outperforms methods relying on real data or conventional augmentation, effectively mitigating low-resource constraints and enabling ASA on opinion expressions with cross-modal information.
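The abstract does not give the form of the dynamic importance loss, so the following is only a minimal sketch of the general idea of reweighting synthesized training instances by how closely their features match real speech. Everything in it is an assumption made for illustration and not the authors' formulation: the function names `importance_weights` and `weighted_mse`, the `temperature` parameter, the diagonal-Gaussian distance, and the use of a squared-error scoring loss.

```python
import numpy as np

def importance_weights(synth_feats, real_feats, temperature=1.0):
    """Hypothetical weighting: synthesized samples whose features lie
    closer to the real-speech feature statistics get larger weights."""
    mu = real_feats.mean(axis=0)
    sigma = real_feats.std(axis=0) + 1e-8  # avoid division by zero
    # Mean squared standardized distance of each synthesized sample
    # from the real-feature mean (a diagonal-Gaussian assumption).
    d2 = (((synth_feats - mu) / sigma) ** 2).mean(axis=1)
    logits = -d2 / temperature
    logits = logits - logits.max()  # numerical stability
    w = np.exp(logits)
    # Normalize so the weights average to 1 across the batch.
    return w * len(w) / w.sum()

def weighted_mse(pred, target, weights):
    """Squared-error scoring loss with per-instance weights."""
    return float(np.mean(weights * (pred - target) ** 2))

# Toy usage with made-up 2-dimensional features.
real = np.array([[0.0, 0.0], [1.0, 1.0], [0.0, 1.0], [1.0, 0.0]])
synth = np.array([[0.5, 0.5], [5.0, 5.0]])
w = importance_weights(synth, real)
loss = weighted_mse(np.array([3.0, 4.0]), np.array([3.5, 2.0]), w)
```

In this sketch the weights would be recomputed per batch from the current features, which is one way to read "dynamic"; the paper may define the reweighting differently.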

