CL AI LG AS | Sep 5, 2023

Leveraging Label Information for Multimodal Emotion Recognition

Peiying Wang, Sunlu Zeng, Junqing Chen, Lu Fan, Meng Chen, Youzheng Wu, Xiaodong He

arXiv:2309.02106v11.36 citations | h-index: 19

Originality: Incremental advance

AI Analysis

This work addresses emotion recognition for applications such as human-computer interaction. It is an incremental advance because it builds on existing multimodal methods.

The paper tackles multimodal emotion recognition by using label information to enhance text and speech representations, and reports new state-of-the-art performance on the IEMOCAP dataset.

Multimodal emotion recognition (MER) aims to detect the emotional state of a given expression by combining speech and text information. Intuitively, label information should help the model locate the salient tokens/frames relevant to a specific emotion, which in turn facilitates the MER task. Inspired by this, we propose a novel approach for MER that leverages label information. Specifically, we first obtain representative label embeddings for both the text and speech modalities, then learn label-enhanced text/speech representations for each utterance via label-token and label-frame interactions. Finally, we devise a novel label-guided attentive fusion module to fuse the label-aware text and speech representations for emotion classification. Extensive experiments on the public IEMOCAP dataset demonstrate that our proposed approach outperforms existing baselines and achieves new state-of-the-art performance.
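The three steps named in the abstract (label embeddings, label-token/label-frame interaction, label-guided fusion) can be illustrated with a small NumPy sketch. This is not the authors' implementation: the abstract does not specify the architecture, so the scaled dot-product attention, the per-label softmax gate between modalities, the averaged label embeddings used for fusion, and all function and variable names (`label_enhanced`, `label_guided_fusion`, the random inputs) are assumptions chosen only to make the idea concrete.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def label_enhanced(features, label_emb):
    """Label-token / label-frame interaction (assumed: dot-product attention).

    features:  (T, d) token or frame vectors for one utterance
    label_emb: (C, d) one embedding per emotion label
    returns:   (C, d) one label-aware summary of the utterance per label
    """
    d = features.shape[-1]
    scores = label_emb @ features.T / np.sqrt(d)  # (C, T)
    weights = softmax(scores, axis=-1)            # each label attends over T
    return weights @ features                     # (C, d)

def label_guided_fusion(text_repr, speech_repr, text_labels, speech_labels):
    """Hypothetical fusion: a per-label gate weighs text against speech."""
    d = text_repr.shape[-1]
    labels = 0.5 * (text_labels + speech_labels)            # assumption: shared label view
    t_score = (labels * text_repr).sum(-1) / np.sqrt(d)     # (C,)
    s_score = (labels * speech_repr).sum(-1) / np.sqrt(d)   # (C,)
    gate = softmax(np.stack([t_score, s_score], axis=-1))   # (C, 2), rows sum to 1
    fused = gate[:, :1] * text_repr + gate[:, 1:] * speech_repr  # (C, d)
    logits = (fused * labels).sum(-1)                       # one score per emotion
    return softmax(logits)                                  # (C,) class probabilities

# Toy inputs: 4 emotion classes, 8-dim features, 12 tokens, 50 speech frames.
rng = np.random.default_rng(0)
C, d = 4, 8
tokens = rng.normal(size=(12, d))
frames = rng.normal(size=(50, d))
text_labels = rng.normal(size=(C, d))
speech_labels = rng.normal(size=(C, d))

text_repr = label_enhanced(tokens, text_labels)
speech_repr = label_enhanced(frames, speech_labels)
probs = label_guided_fusion(text_repr, speech_repr, text_labels, speech_labels)
```

In a trained model the label embeddings and encoders would be learned; here they are random, so `probs` only demonstrates the shapes and data flow, not meaningful predictions.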
