Multimodal Sentiment Analysis: Addressing Key Issues and Setting up the Baselines
This work provides a foundational benchmark for researchers in multimodal sentiment analysis, though it is incremental in nature.
The paper tackles multimodal sentiment analysis by establishing baselines and exploring three deep-learning architectures, each improving on the previous one, evaluated on multiple datasets with fixed train/test partitions. It also addresses frequently ignored issues such as the role of speaker-exclusive models, the importance of different modalities, and generalizability, setting a new benchmark for future research.
We compile baselines, along with dataset splits, for multimodal sentiment analysis. In this paper, we explore three different deep-learning-based architectures for multimodal sentiment classification, each improving upon the previous one. Further, we evaluate these architectures on multiple datasets with fixed train/test partitions. We also discuss some major issues that are frequently ignored in multimodal sentiment analysis research, e.g., the role of speaker-exclusive models, the importance of different modalities, and generalizability. This framework illustrates the different facets of analysis to be considered when performing multimodal sentiment analysis and, hence, serves as a new benchmark for future research in this emerging field.
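To make the idea of a speaker-exclusive, fixed train/test partition concrete, the following is a minimal sketch and not the authors' code. It assumes each utterance is a dictionary with a hypothetical `speaker` field, and the function name `speaker_exclusive_split` and the toy data are illustrative only. The point is that the test speakers are chosen once and every utterance from those speakers is held out, so no speaker appears on both sides of the split.

```python
def speaker_exclusive_split(utterances, test_speakers):
    """Partition utterances so that no speaker appears in both train and test.

    `utterances` is a list of dicts with a "speaker" key (an assumed layout);
    `test_speakers` is the fixed set of speakers reserved for testing.
    """
    held_out = set(test_speakers)
    train, test = [], []
    for utt in utterances:
        # Route the utterance by its speaker, never by random sampling,
        # so the partition stays fixed and speaker-exclusive.
        if utt["speaker"] in held_out:
            test.append(utt)
        else:
            train.append(utt)
    return train, test


# Toy data, purely illustrative.
utterances = [
    {"id": "u1", "speaker": "A", "label": 1},
    {"id": "u2", "speaker": "A", "label": 0},
    {"id": "u3", "speaker": "B", "label": 1},
    {"id": "u4", "speaker": "C", "label": 0},
]
train, test = speaker_exclusive_split(utterances, test_speakers=["C"])
```

Fixing the list of test speakers in advance, rather than shuffling utterances, is what keeps results comparable across models and lets the test set measure generalization to unseen speakers.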