CL HC LG | Oct 7, 2023

Integrating Contrastive Learning into a Multitask Transformer Model for Effective Domain Adaptation

arXiv:2310.04703v1 | h-index: 43
Originality: Incremental advance
AI Analysis

This work addresses the challenge of domain adaptation in speech emotion recognition, which is crucial for real-world applications, but it appears incremental as it builds on existing multitask and transformer methods.

The paper tackles the problem of generalizing speech emotion recognition across datasets. It proposes a domain adaptation technique that integrates contrastive learning and information maximization into a multitask transformer model, and it reports state-of-the-art performance in cross-corpus scenarios on datasets like IEMOCAP and MSP-IMPROV.

While speech emotion recognition (SER) research has made significant progress, achieving generalization across various corpora continues to pose a problem. We propose a novel domain adaptation technique that embodies a multitask framework with SER as the primary task, and contrastive learning and information maximisation loss as auxiliary tasks, underpinned by fine-tuning of transformers pre-trained on large language models. Empirical results obtained through experiments on well-established datasets like IEMOCAP and MSP-IMPROV illustrate that our proposed model achieves state-of-the-art performance in SER within cross-corpus scenarios.
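As a rough illustration of how a primary SER loss might be combined with contrastive and information maximisation auxiliaries, here is a minimal NumPy sketch. The abstract does not give the loss formulas, so everything below is an assumption: an NT-Xent contrastive loss over two views of each utterance, an entropy-based information maximisation term (per-sample entropy minus marginal entropy) on unlabeled target-domain predictions, and a weighted sum with hypothetical weights `w_con` and `w_im`. All function names are illustrative and not taken from the paper.

```python
import numpy as np

def _softmax(logits):
    # Numerically stable row-wise softmax.
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def emotion_cross_entropy(logits, labels):
    # Primary task: cross-entropy on labeled (source-corpus) utterances.
    probs = _softmax(logits)
    picked = probs[np.arange(len(labels)), labels]
    return float(-np.log(picked + 1e-12).mean())

def information_maximisation(logits):
    # Assumed form: confident per-sample predictions (low conditional entropy)
    # that stay diverse across the batch (high marginal entropy).
    probs = _softmax(logits)
    conditional_entropy = -(probs * np.log(probs + 1e-12)).sum(axis=1).mean()
    marginal = probs.mean(axis=0)
    marginal_entropy = -(marginal * np.log(marginal + 1e-12)).sum()
    return float(conditional_entropy - marginal_entropy)

def nt_xent(view_a, view_b, temperature=0.5):
    # Assumed contrastive loss: row i of view_a and row i of view_b are positives.
    z = np.concatenate([view_a, view_b], axis=0).astype(float)
    z = z / np.linalg.norm(z, axis=1, keepdims=True)
    sim = z @ z.T / temperature
    np.fill_diagonal(sim, -np.inf)  # exclude self-similarity from the denominator
    n = len(view_a)
    positives = np.concatenate([np.arange(n, 2 * n), np.arange(n)])
    row_max = sim.max(axis=1, keepdims=True)
    log_denominator = row_max[:, 0] + np.log(np.exp(sim - row_max).sum(axis=1))
    return float((log_denominator - sim[np.arange(2 * n), positives]).mean())

def multitask_loss(src_logits, src_labels, tgt_logits, view_a, view_b,
                   w_con=1.0, w_im=1.0):
    # Hypothetical weighting; the abstract does not state how the terms are balanced.
    return (emotion_cross_entropy(src_logits, src_labels)
            + w_con * nt_xent(view_a, view_b)
            + w_im * information_maximisation(tgt_logits))
```

In this sketch the logits and embeddings would come from a fine-tuned transformer encoder, with `tgt_logits` computed on the unlabeled target corpus. Setting `w_con` and `w_im` to zero reduces it to plain supervised SER training.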

