Multi-Task Learning with Auxiliary Speaker Identification for Conversational Emotion Recognition
This work addresses the challenge of speaker-sensitive utterance representation in conversational emotion recognition for NLP applications, though its contribution is incremental because it builds on existing multi-task learning approaches.
The paper tackles conversational emotion recognition by using speaker identification as an auxiliary task to improve utterance representations, and it reports new state-of-the-art results on two benchmark datasets.
Conversational emotion recognition (CER) has attracted increasing interest in the natural language processing (NLP) community. Unlike vanilla emotion recognition, CER faces a major challenge in learning effective speaker-sensitive utterance representations. In this paper, we exploit speaker identification (SI) as an auxiliary task to enhance utterance representations in conversations. With this method, we can learn better speaker-aware contextual representations from an additional SI corpus. Experiments on two benchmark datasets demonstrate that the proposed architecture is highly effective for CER, obtaining new state-of-the-art results on both.
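The auxiliary-task idea can be illustrated with a minimal forward-pass sketch in plain Python. It is not the paper's architecture: the abstract does not specify the encoder, the heads, or how the two losses are combined. The sketch assumes a single shared tanh layer, two softmax heads, and a weighted sum of cross-entropy losses in which the CER example and the SI example come from separate corpora. The names `MultiTaskSketch`, `si_weight`, and all dimensions are hypothetical.

```python
import math
import random


def linear(x, W, b):
    # Affine map y = W x + b for a single input vector x.
    return [sum(w * xi for w, xi in zip(row, x)) + bi for row, bi in zip(W, b)]


def softmax(z):
    # Numerically stable softmax over a list of logits.
    m = max(z)
    e = [math.exp(v - m) for v in z]
    s = sum(e)
    return [v / s for v in e]


def cross_entropy(probs, label):
    # Negative log-likelihood of the gold label.
    return -math.log(probs[label])


def init_matrix(rows, cols, rng):
    # Small random weights; the initialization scheme is an arbitrary choice.
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class MultiTaskSketch:
    """Hypothetical shared encoder with an emotion head and a speaker head."""

    def __init__(self, in_dim, hid_dim, n_emotions, n_speakers, seed=0):
        rng = random.Random(seed)
        # Shared encoder parameters, used by both tasks.
        self.W_enc = init_matrix(hid_dim, in_dim, rng)
        self.b_enc = [0.0] * hid_dim
        # Task-specific heads.
        self.W_emo = init_matrix(n_emotions, hid_dim, rng)
        self.b_emo = [0.0] * n_emotions
        self.W_spk = init_matrix(n_speakers, hid_dim, rng)
        self.b_spk = [0.0] * n_speakers

    def encode(self, x):
        # Assumption: one tanh layer stands in for the utterance encoder.
        return [math.tanh(v) for v in linear(x, self.W_enc, self.b_enc)]

    def cer_loss(self, x, emotion):
        # Main task: emotion classification on a CER utterance.
        probs = softmax(linear(self.encode(x), self.W_emo, self.b_emo))
        return cross_entropy(probs, emotion)

    def si_loss(self, x, speaker):
        # Auxiliary task: speaker identification on an SI utterance.
        probs = softmax(linear(self.encode(x), self.W_spk, self.b_spk))
        return cross_entropy(probs, speaker)

    def joint_loss(self, cer_example, si_example, si_weight=0.5):
        # Assumption: the objective is a weighted sum of the two losses.
        l_cer = self.cer_loss(*cer_example)
        l_si = self.si_loss(*si_example)
        return l_cer + si_weight * l_si, l_cer, l_si


model = MultiTaskSketch(in_dim=4, hid_dim=8, n_emotions=3, n_speakers=2)
cer_example = ([0.2, -0.1, 0.4, 0.0], 1)   # (utterance features, emotion id)
si_example = ([0.0, 0.3, -0.2, 0.1], 0)    # (utterance features, speaker id)
total, l_cer, l_si = model.joint_loss(cer_example, si_example)
```

Because both losses pass through `encode`, gradients from the speaker head would update the shared encoder during training, which is how an auxiliary SI corpus could shape the utterance representation used for emotion prediction. Training itself (backpropagation, batching, and the sampling ratio between corpora) is omitted here.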