CL · AI · Oct 18, 2023

Learning Co-Speech Gesture for Multimodal Aphasia Type Detection

arXiv:2310.11710v2131 citations · h-index: 11
Originality: Incremental advance
AI Analysis

This work addresses the need for accurate aphasia type detection to improve treatment, focusing on a domain-specific medical application with incremental novelty.

The paper tackles the problem of detecting specific types of aphasia by proposing a multimodal graph neural network that uses speech and co-speech gestures, achieving state-of-the-art results with an F1 score of 84.2%.

Aphasia, a language disorder resulting from brain damage, requires accurate identification of specific aphasia types, such as Broca's and Wernicke's aphasia, for effective treatment. However, little attention has been paid to developing methods to detect different types of aphasia. Recognizing the importance of analyzing co-speech gestures for distinguishing aphasia types, we propose a multimodal graph neural network for aphasia type detection using speech and corresponding gesture patterns. By learning the correlation between the speech and gesture modalities for each aphasia type, our model can generate textual representations sensitive to gesture information, leading to accurate aphasia type detection. Extensive experiments demonstrate the superiority of our approach over existing methods, achieving state-of-the-art results (F1 84.2%). We also show that gesture features outperform acoustic features, highlighting the significance of gesture expression in detecting aphasia types. We provide the code for reproducibility purposes.
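For readers unfamiliar with the metric, the following is a minimal sketch of how an F1 score can be computed for a multi-class task such as aphasia type detection. It is not the authors' evaluation code. The abstract does not say which averaging scheme is used, so macro averaging is an assumption here, and the function name `macro_f1` and the toy labels are hypothetical.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Macro-averaged F1: the unweighted mean of per-class F1 scores.

    Assumption: macro averaging. The abstract does not state which
    averaging the reported F1 uses.
    """
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1   # correct prediction for this class
        else:
            fp[pred] += 1    # predicted class was wrong
            fn[truth] += 1   # true class was missed
    scores = []
    for label in labels:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores.append(2 * tp[label] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Hypothetical toy labels, not data from the paper.
y_true = ["broca", "broca", "wernicke", "wernicke"]
y_pred = ["broca", "wernicke", "wernicke", "wernicke"]
score = macro_f1(y_true, y_pred)
```

Macro averaging weights every aphasia type equally regardless of how many samples it has. A weighted or micro average would give different numbers when the classes are imbalanced.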

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
