Exploiting Diverse Feature for Multimodal Sentiment Analysis
This work addresses personalized sentiment prediction for individual participants in multimodal analysis, but its contribution is incremental: it focuses on feature extraction and ensemble methods for a specific challenge.
The paper tackles the prediction of continuous arousal and valence values from multimodal data by exploiting diverse features to build robust representations, placing third in the MuSe-Personalisation sub-challenge with CCC scores of 0.8492 for arousal and 0.8439 for valence.
In this paper, we present our solution to the MuSe-Personalisation sub-challenge of the MuSe 2023 Multimodal Sentiment Analysis Challenge. The MuSe-Personalisation task aims to predict a participant's continuous arousal and valence values from their audio-visual, language, and physiological signal data. Because each person has individual characteristics, the main challenge of this task is building robust feature representations for sentiment prediction. To address this issue, we propose exploiting diverse features. Specifically, we propose a series of feature extraction methods to build a robust representation, combined with a model ensemble. We empirically evaluate the performance of our method on the officially provided dataset. \textbf{As a result, we achieved 3rd place in the MuSe-Personalisation sub-challenge.} Specifically, we achieve concordance correlation coefficient (CCC) scores of 0.8492 for arousal and 0.8439 for valence.
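For readers unfamiliar with the reported metric, the concordance correlation coefficient (CCC) can be sketched as below. This is a minimal NumPy illustration of the standard CCC definition, not the challenge's official evaluation script; the function name `concordance_cc` and the toy arrays are assumptions introduced here for illustration only.

```python
import numpy as np


def concordance_cc(pred, gold):
    """Concordance correlation coefficient between two 1-D sequences.

    Hypothetical helper (not from the paper). Uses population statistics
    (ddof=0) for both the variances and the covariance.
    """
    pred = np.asarray(pred, dtype=float)
    gold = np.asarray(gold, dtype=float)
    mean_p, mean_g = pred.mean(), gold.mean()
    var_p, var_g = pred.var(), gold.var()
    cov = ((pred - mean_p) * (gold - mean_g)).mean()
    # Unlike Pearson correlation, the denominator also penalises a shift
    # between the two means. It is zero only if both inputs are constant
    # and equal, a case this sketch does not handle.
    return 2.0 * cov / (var_p + var_g + (mean_p - mean_g) ** 2)


if __name__ == "__main__":
    gold = [0.1, 0.4, 0.35, 0.8]          # toy annotations (made up)
    pred = [0.2, 0.5, 0.45, 0.9]          # same shape, shifted by 0.1
    print(concordance_cc(pred, gold))     # below 1 because of the offset
```

A prediction that is perfectly correlated with the annotation but offset from it still scores below 1, which is why CCC is a stricter target than Pearson correlation for continuous arousal and valence traces.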