CV, CL, SD, AS, IV | Aug 5, 2022

Hybrid Multimodal Feature Extraction, Mining and Fusion for Sentiment Analysis

arXiv:2208.03051v2 | 37 citations | h-index: 48
Originality: Incremental advance
AI Analysis

This work addresses multimodal sentiment analysis in the three MuSe 2022 sub-challenges. It is an incremental advance that reports strong results on each task.

The paper tackles multimodal sentiment analysis by extracting acoustic, visual, text, and biological features and fusing them with TEMMA and a GRU with self-attention. It reports results on three sub-challenges: an AUC of 0.8932 for humor detection, a Pearson's correlation of 0.3879 for emotional reactions (the best among participants), and a combined score of 0.5151 for stress prediction (above the baseline).

In this paper, we present our solutions for the Multimodal Sentiment Analysis Challenge (MuSe) 2022, which includes the MuSe-Humor, MuSe-Reaction and MuSe-Stress sub-challenges. MuSe 2022 focuses on humor detection, emotional reactions and multimodal emotional stress, using different modalities and data sets. In our work, several kinds of multimodal features are extracted, including acoustic, visual, text and biological features. These features are fused using the TEMMA framework and a GRU framework with a self-attention mechanism. Our contributions are: 1) several new audio features, facial expression features and paragraph-level text embeddings are extracted to improve accuracy; 2) the accuracy and reliability of multimodal sentiment prediction are substantially improved by mining and blending the multimodal features; 3) effective data augmentation strategies are applied in model training to alleviate sample imbalance and to prevent the model from learning biased subject characteristics. For the MuSe-Humor sub-challenge, our model obtains an AUC score of 0.8932. For the MuSe-Reaction sub-challenge, the Pearson's correlation coefficient of our approach on the test set is 0.3879, which outperforms all other participants. For the MuSe-Stress sub-challenge, our approach outperforms the baseline in both arousal and valence on the test set, reaching a final combined result of 0.5151.
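The abstract reports an AUC for MuSe-Humor and a Pearson's correlation coefficient for MuSe-Reaction. As a rough illustration of what these two metrics measure, the sketch below computes both in plain Python. The function names `roc_auc` and `pearson` and the toy inputs are assumptions made for this example; this is not the challenge's official evaluation code, and the MuSe-Stress combined score is not covered because its definition is not given here.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson's correlation coefficient between two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance and variances, all without the 1/n factor (it cancels out).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


def roc_auc(labels, scores):
    """Area under the ROC curve for binary labels (1 = positive, 0 = negative).

    Uses the pairwise (Mann-Whitney) formulation: the fraction of
    positive/negative pairs in which the positive sample has the higher
    score, counting ties as half. Quadratic in the number of samples,
    which is fine for a small illustration.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for q in neg:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data, invented for illustration only.
    print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
    print(pearson([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]))
```

An AUC of 0.5 corresponds to random ranking and 1.0 to perfect ranking. Pearson's correlation ranges from -1 to 1, with 0 indicating no linear relationship between predictions and targets.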
