Solution for Emotion Prediction Competition of Workshop on Emotionally and Culturally Intelligent AI
This work addresses emotion prediction in culturally diverse contexts, though it appears incremental as it builds on existing models like XLM-R and X²-VLM.
The authors tackled the WECIA Emotion Prediction Competition by developing a method to predict emotions from artistic works with comments, addressing modal imbalance and cultural differences in the ArtELingo dataset. Their approach achieved first place with a score of 0.627.
This report provides a detailed description of the method that we explored and proposed for the WECIA Emotion Prediction Competition (EPC), whose task is to predict a person's emotion from an artistic work and an accompanying comment. The competition dataset is ArtELingo, designed to encourage work on diversity across languages and cultures. The dataset poses two main challenges: modal imbalance and language-cultural differences. To address these challenges, we propose a simple yet effective approach called single-multi modal with Emotion-Cultural specific prompt (ECSP), which uses single-modality information to enhance the performance of multimodal models and a well-designed prompt to reduce the effect of cultural differences. Specifically, our approach contains two main blocks: (1) an XLM-R\cite{conneau2019unsupervised}-based unimodal model and an X$^2$-VLM\cite{zeng2022x}-based multimodal model, and (2) an Emotion-Cultural specific prompt. Our approach ranked first in the final test with a score of 0.627.
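The two blocks can be illustrated with a minimal sketch. The abstract does not specify the prompt template or how the unimodal and multimodal outputs are combined, so everything below is an assumption made for illustration: the label list `EMOTIONS`, the template in `build_prompt`, the weighted probability average in `fuse`, and the `text_weight` parameter are all hypothetical. The logits would come from the text-only and image-text models; here they are passed in as plain lists.

```python
import math

# Illustrative label set; an assumption, not taken from the report.
EMOTIONS = ["amusement", "awe", "contentment", "excitement",
            "anger", "disgust", "fear", "sadness", "something else"]


def build_prompt(comment: str, language: str) -> str:
    """Hypothetical emotion-cultural prompt.

    Prepends the comment's language and the candidate emotions to the
    comment text before it is fed to a text or image-text model.
    """
    return (f"Language: {language}. "
            f"Emotions: {', '.join(EMOTIONS)}. "
            f"Comment: {comment}")


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse(text_logits, multimodal_logits, text_weight=0.5):
    """Assumed fusion rule: weighted average of class probabilities
    from the unimodal (text) and multimodal (image + text) models."""
    p_text = softmax(text_logits)
    p_mm = softmax(multimodal_logits)
    return [text_weight * a + (1.0 - text_weight) * b
            for a, b in zip(p_text, p_mm)]


def predict(text_logits, multimodal_logits, text_weight=0.5):
    """Return the emotion label with the highest fused probability."""
    probs = fuse(text_logits, multimodal_logits, text_weight)
    best = max(range(len(probs)), key=probs.__getitem__)
    return EMOTIONS[best]
```

In this sketch, `text_weight` controls how much the text-only prediction contributes to the final decision; a suitable value would have to be chosen on validation data.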