CV | MM | Nov 23, 2022

Holistic Visual-Textual Sentiment Analysis with Prior Models

arXiv:2211.12981v2 | 5 citations | h-index: 19
Originality: Incremental advance
AI Analysis

This work addresses visual-textual sentiment analysis and offers an incremental improvement relevant to social media and content analysis applications.

The paper tackles the challenge of predicting sentiment from image-text pairs by proposing a holistic method that leverages pre-trained visual and textual prior models, achieving better performance than existing methods on three datasets.

Visual-textual sentiment analysis aims to predict sentiment with the input of a pair of image and text, which poses a challenge in learning effective features for diverse input images. To address this, we propose a holistic method that achieves robust visual-textual sentiment analysis by exploiting a rich set of powerful pre-trained visual and textual prior models. The proposed method consists of four parts: (1) a visual-textual branch to learn features directly from data for sentiment analysis, (2) a visual expert branch with a set of pre-trained "expert" encoders to extract selected semantic visual features, (3) a CLIP branch to implicitly model visual-textual correspondence, and (4) a multimodal feature fusion network based on BERT to fuse multimodal features and make sentiment predictions. Extensive experiments on three datasets show that our method produces better visual-textual sentiment analysis performance than existing methods.
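The four-part layout described in the abstract can be illustrated with a minimal, standard-library-only sketch. Everything in it is an assumption made for illustration rather than the authors' implementation: the branch encoders are stubs that emit fixed pseudo-random tokens, the names (`make_stub_encoder`, `fuse_and_classify`, `predict`), the feature width `DIM`, and the three-class label set are hypothetical, and the BERT-based fusion network is replaced by mean pooling followed by a linear layer. It only shows how tokens from the three branches might be gathered into one sequence and passed to a single fusion step.

```python
import math
import random
from typing import Callable, List, Sequence

Vector = List[float]
Tokens = List[Vector]  # a sequence of feature tokens, each of width DIM

DIM = 8  # hypothetical shared feature width
LABELS = ["negative", "neutral", "positive"]  # assumed label set


def make_stub_encoder(seed: int, n_tokens: int) -> Callable[[object], Tokens]:
    """Return a stand-in encoder that ignores its input and emits fixed tokens."""
    rng = random.Random(seed)
    tokens = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in range(n_tokens)]
    return lambda _inputs: [row[:] for row in tokens]


# (1) Visual-textual branch: features learned directly from the data (stub).
visual_textual_branch = make_stub_encoder(seed=1, n_tokens=4)
# (2) Visual expert branch: several pre-trained "expert" encoders (stubs).
visual_experts = [make_stub_encoder(seed=10 + i, n_tokens=1) for i in range(3)]
# (3) CLIP branch: one image token and one text token (stub).
clip_branch = make_stub_encoder(seed=2, n_tokens=2)


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def fuse_and_classify(tokens: Tokens, weights: List[Vector]) -> List[float]:
    """(4) Fusion stand-in: mean-pool all tokens, then linear layer + softmax.

    The paper uses a BERT-based fusion network; this is a simplification.
    """
    pooled = [sum(tok[d] for tok in tokens) / len(tokens) for d in range(DIM)]
    logits = [sum(w * x for w, x in zip(row, pooled)) for row in weights]
    return softmax(logits)


def predict(image: object, text: object, weights: List[Vector]) -> List[float]:
    """Collect tokens from all three branches and fuse them into class probabilities."""
    tokens: Tokens = []
    tokens += visual_textual_branch((image, text))
    for expert in visual_experts:
        tokens += expert(image)
    tokens += clip_branch((image, text))
    return fuse_and_classify(tokens, weights)


rng = random.Random(0)
weights = [[rng.uniform(-1, 1) for _ in range(DIM)] for _ in LABELS]
probs = predict("image.jpg", "some caption", weights)
print(dict(zip(LABELS, probs)))
```

In a real system each stub would be replaced by an actual pre-trained model, and the pooled linear layer by a transformer that attends across the concatenated token sequence.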

Code Implementations: 1 repo