CL, SD, AS | May 7, 2023

Learning Robust Self-attention Features for Speech Emotion Recognition with Label-adaptive Mixup

arXiv:2305.06273v1 | 13 citations
Originality: Incremental advance
AI Analysis

This work addresses the problem of improving emotion recognition accuracy in human-machine interactions, representing an incremental advancement in SER.

The paper tackles the challenge of ambiguous human emotions in Speech Emotion Recognition (SER) by proposing a self-attention method that combines label-adaptive mixup with center loss, and it reports performance superior to state-of-the-art methods.

Speech Emotion Recognition (SER) aims to recognize human emotions during natural verbal interaction with machines, and it is considered a challenging problem because human emotions are ambiguous. Despite recent progress in SER, state-of-the-art models still struggle to achieve satisfactory performance. We propose a self-attention-based method that combines label-adaptive mixup with center loss. By adapting label probabilities in mixup and fitting center loss to the mixup training scheme, our proposed method achieves superior performance to state-of-the-art methods.
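
The abstract names two ingredients, mixup and center loss, without giving their formulas. As a rough illustration only, the sketch below shows standard mixup (a convex combination of two samples and their labels) and a center loss generalized to soft labels. It is not the paper's label-adaptive scheme: the function names `mixup` and `soft_center_loss`, the `alpha` parameter, and the choice to weight each class-center distance by the mixed label probability are all assumptions made for this example.

```python
import numpy as np


def mixup(x, y_onehot, alpha=1.0, rng=None):
    """Standard mixup: blend each sample with a randomly chosen partner.

    NOTE: this is plain mixup, not the paper's label-adaptive variant.
    """
    rng = np.random.default_rng() if rng is None else rng
    lam = rng.beta(alpha, alpha)              # mixing coefficient in [0, 1]
    perm = rng.permutation(len(x))            # partner index for each sample
    x_mix = lam * x + (1.0 - lam) * x[perm]
    y_mix = lam * y_onehot + (1.0 - lam) * y_onehot[perm]
    return x_mix, y_mix, lam


def soft_center_loss(features, y_soft, centers):
    """Center loss with soft labels (an assumed adaptation for mixed samples).

    Each sample is pulled toward every class center in proportion to its
    label probability for that class. With one-hot labels this reduces to
    the usual center loss: 0.5 * mean squared distance to the own-class center.
    """
    diff = features[:, None, :] - centers[None, :, :]   # shape (N, C, D)
    sq_dist = np.sum(diff ** 2, axis=-1)                # shape (N, C)
    return 0.5 * np.mean(np.sum(y_soft * sq_dist, axis=1))


# Toy usage: 8 feature vectors of dimension 16, 4 emotion classes.
rng = np.random.default_rng(0)
feats = rng.normal(size=(8, 16))
labels = np.eye(4)[rng.integers(0, 4, size=8)]
centers = rng.normal(size=(4, 16))

feats_mix, labels_mix, lam = mixup(feats, labels, alpha=1.0, rng=rng)
loss = soft_center_loss(feats_mix, labels_mix, centers)
```

In a real training loop the centers would be learned or updated alongside the network, and this term would be added to a classification loss on the mixed labels; how the paper balances the two is not stated in the abstract.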

Code Implementations: 1 repo
