CL · Feb 22, 2024

Multi-modal Stance Detection: New Datasets and Model

arXiv:2402.14298v3 · 31 citations · h-index: 15 · ACL
Originality: Incremental advance
AI Analysis

This paper addresses the problem of identifying public opinion from multi-modal social media content, a task relevant to both researchers and practitioners. The advance is incremental: it extends existing text-based stance detection methods to include images.

The paper tackles multi-modal stance detection from tweets containing text and images by creating five new datasets and proposing a Targeted Multi-modal Prompt Tuning (TMPT) framework, which achieves state-of-the-art performance on these benchmarks.

Stance detection is a challenging task that aims to identify public opinion from social media platforms with respect to specific targets. Previous work on stance detection largely focused on pure texts. In this paper, we study multi-modal stance detection for tweets consisting of texts and images, which are prevalent in today's fast-growing social media platforms where people often post multi-modal messages. To this end, we create five new multi-modal stance detection datasets of different domains based on Twitter, in which each example consists of a text and an image. In addition, we propose a simple yet effective Targeted Multi-modal Prompt Tuning framework (TMPT), where target information is leveraged to learn multi-modal stance features from textual and visual modalities. Experimental results on our five benchmark datasets show that the proposed TMPT achieves state-of-the-art performance in multi-modal stance detection.

Code Implementations: 1 repo
