CL AI SD AS

Oct 27, 2022

Automatic Severity Classification of Dysarthric speech by using Self-supervised Model with Multi-task Learning

Eun Jung Yeo, Kwanghee Choi, Sunhee Kim, Minhwa Chung

CMU

arXiv:2210.15387v31.614 citations, h-index: 13, Has Code

Originality: Incremental advance

AI Analysis

This work addresses data scarcity in dysarthric speech assessment for clinical rehabilitation, but it is incremental as it builds on existing self-supervised models with multi-task learning.

The paper tackles automatic severity classification of dysarthric speech by fine-tuning a self-supervised model with multi-task learning. It reports a 1.25% relative increase in F1-score over traditional baselines and a 10.61% relative improvement over the same model trained without multi-task learning.
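The figures above are relative, not absolute, gains. As a minimal sketch of that arithmetic, the helper below computes a relative percentage increase. The function name and the two input scores are hypothetical, chosen only to illustrate the formula; they are not the scores reported in the paper.

```python
def relative_increase(baseline: float, proposed: float) -> float:
    """Return the relative percentage increase of `proposed` over `baseline`."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    # (proposed - baseline) / baseline, expressed as a percentage.
    return (proposed - baseline) * 100.0 / baseline


# Hypothetical F1-scores (NOT taken from the paper), used only as an example.
baseline_f1 = 80.0
proposed_f1 = 81.0
gain = relative_increase(baseline_f1, proposed_f1)  # 1.25
```

With these made-up inputs, a one-point absolute gain on a baseline of 80 corresponds to a 1.25% relative increase, which shows why a relative figure should not be read as a difference in F1 points.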

Automatic assessment of dysarthric speech is essential for sustained treatments and rehabilitation. However, obtaining atypical speech is challenging, often leading to data scarcity issues. To tackle the problem, we propose a novel automatic severity assessment method for dysarthric speech that uses a self-supervised model in conjunction with multi-task learning. Wav2vec 2.0 XLS-R is jointly trained for two different tasks: severity classification and auxiliary automatic speech recognition (ASR). For the baseline experiments, we employ hand-crafted acoustic features and machine learning classifiers such as SVM, MLP, and XGBoost. Evaluated on the Korean dysarthric speech QoLT database, our model outperforms the traditional baseline methods, with a relative percentage increase of 1.25% in F1-score. In addition, the proposed model surpasses the model trained without the ASR head, achieving a 10.61% relative improvement. Furthermore, we show how multi-task learning affects severity classification performance by analyzing the latent representations and the regularization effect.
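The joint training described above can be pictured as a single objective that adds an auxiliary ASR loss to the severity classification loss. The sketch below is one common way to combine two task losses, not the authors' implementation: the abstract does not say how the losses are weighted, so `asr_weight`, the function names, the number of severity classes, and the example values are all assumptions. The ASR loss is passed in as a precomputed number rather than computed here.

```python
import math
from typing import Sequence


def cross_entropy(logits: Sequence[float], label: int) -> float:
    """Cross-entropy of one example, computed with a stable log-sum-exp."""
    peak = max(logits)
    log_norm = peak + math.log(sum(math.exp(x - peak) for x in logits))
    return log_norm - logits[label]


def multitask_loss(
    severity_logits: Sequence[float],
    severity_label: int,
    asr_loss: float,
    asr_weight: float = 0.1,  # hypothetical weight; not given in the abstract
) -> float:
    """Severity classification loss plus a weighted auxiliary ASR loss."""
    classification_loss = cross_entropy(severity_logits, severity_label)
    return classification_loss + asr_weight * asr_loss


# Hypothetical example: three severity classes and a made-up ASR loss value.
total = multitask_loss([2.0, 0.5, -1.0], severity_label=0, asr_loss=3.0)
```

Setting `asr_weight` to 0 recovers a single-task classifier, which corresponds to the comparison against the model trained without the ASR head.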

