CL · Nov 4, 2022

Late Fusion with Triplet Margin Objective for Multimodal Ideology Prediction and Analysis

Changyuan Qiu, Winston Wu, Xinliang Frederick Zhang, Lu Wang

arXiv:2211.02269v123.9290 citations · h-index: 20

Originality: Incremental advance

AI Analysis

This work addresses ideology prediction for political content analysis by extending it to multimodal data, representing an incremental advance with domain-specific impact.

The paper tackles multimodal ideology prediction by introducing new datasets and a late-fusion model with triplet margin objective, which outperforms the state-of-the-art text-only model by almost 4% and a strong multimodal baseline by over 3%.

Prior work on ideology prediction has largely focused on single modalities, i.e., text or images. In this work, we introduce the task of multimodal ideology prediction, where a model predicts binary or five-point scale ideological leanings, given a text-image pair with political content. We first collect five new large-scale datasets with English documents and images along with their ideological leanings, covering news articles from a wide range of US mainstream media and social media posts from Reddit and Twitter. We conduct in-depth analyses of news articles and reveal differences in image content and usage across the political spectrum. Furthermore, we perform extensive experiments and ablation studies, demonstrating the effectiveness of targeted pretraining objectives on different model components. Our best-performing model, a late-fusion architecture pretrained with a triplet objective over multimodal content, outperforms the state-of-the-art text-only model by almost 4% and a strong multimodal baseline with no pretraining by over 3%.
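To make the two ideas named in the abstract concrete, here is a minimal sketch of late fusion combined with a triplet margin loss. It is not the authors' implementation: the abstract does not specify the encoders, the fusion operator, the distance, or the margin, so concatenation as the fusion step, Euclidean distance, a margin of 1.0, the toy 2-dimensional embeddings, and all function names below are assumptions chosen for illustration.

```python
import math

def late_fuse(text_vec, image_vec):
    """Late fusion (assumed here to be concatenation): each modality is
    embedded separately, and the embeddings are combined only afterwards."""
    return list(text_vec) + list(image_vec)

def euclidean(u, v):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))

def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Triplet margin loss: max(0, d(a, p) - d(a, n) + margin).
    It is zero once the negative is at least `margin` farther from the
    anchor than the positive is."""
    return max(0.0, euclidean(anchor, positive) - euclidean(anchor, negative) + margin)

# Toy text/image embeddings (hypothetical values, not real model outputs).
anchor = late_fuse([1.0, 0.0], [0.0, 1.0])      # e.g. a text-image pair
positive = late_fuse([1.0, 0.0], [0.0, 1.0])    # same leaning as the anchor
negative = late_fuse([0.0, 1.0], [1.0, 0.0])    # different leaning

well_separated = triplet_margin_loss(anchor, positive, negative)
violated = triplet_margin_loss(anchor, negative, positive)
```

In this toy setup the first triplet already satisfies the margin, so its loss is zero, while swapping the positive and negative roles produces a positive loss that training would push down.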
