CV, Sep 26, 2025

Effectiveness of Large Multimodal Models in Detecting Disinformation: Experimental Results

arXiv:2509.22377v1, 2 citations, h-index: 8

Originality: Synthesis-oriented

AI Analysis

This work addresses the challenge that disinformation on digital platforms poses for both users and platform operators. It is incremental, however: it applies an existing model to a new task, with methodological enhancements.

The study tackles multimodal disinformation detection using GPT-4o and reports a performance analysis across multiple datasets, including Gossipcop and Politifact, that highlights the model's strengths and limitations in classification.

The proliferation of disinformation, particularly in multimodal contexts combining text and images, presents a significant challenge across digital platforms. This study investigates the potential of large multimodal models (LMMs) in detecting and mitigating false information. We approach multimodal disinformation detection by leveraging the advanced capabilities of the GPT-4o model. Our contributions include: (1) the development of an optimized prompt incorporating advanced prompt engineering techniques to ensure precise and consistent evaluations; (2) the implementation of a structured framework for multimodal analysis, including a preprocessing methodology for images and text to comply with the model's token limitations; (3) the definition of six specific evaluation criteria that enable a fine-grained classification of content, complemented by a self-assessment mechanism based on confidence levels; (4) a comprehensive performance analysis of the model across multiple heterogeneous datasets (Gossipcop, Politifact, Fakeddit, MMFakeBench, and AMMEBA), highlighting GPT-4o's strengths and limitations in disinformation detection; (5) an investigation of prediction variability through repeated testing, evaluating the stability and reliability of the model's classifications; and (6) the introduction of confidence-level and variability-based evaluation methods. These contributions provide a robust and reproducible methodological framework for automated multimodal disinformation analysis.
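
As an illustration of what contribution (5) might involve, the following is a minimal sketch of summarizing repeated classifications of a single item. It is not the authors' method: the function name `summarize_runs`, the label strings, and the agreement metric (share of runs matching the majority label) are all assumptions made for this example.

```python
from collections import Counter


def summarize_runs(labels):
    """Summarize repeated classifications of one item.

    `labels` is a non-empty list of predicted labels, one per repeated run.
    Hypothetical helper for illustration only; the paper's own variability
    measure may differ.
    """
    counts = Counter(labels)
    # Most frequent label and how many runs produced it.
    majority_label, majority_count = counts.most_common(1)[0]
    return {
        "majority": majority_label,
        # Fraction of runs that agree with the majority label.
        "agreement": majority_count / len(labels),
        # True only if every run returned the same label.
        "stable": len(counts) == 1,
    }


# Example: five repeated runs on the same post (made-up labels).
runs = ["fake", "fake", "real", "fake", "fake"]
summary = summarize_runs(runs)
print(summary)
```

An agreement value well below 1.0 would flag an item whose classification changes between runs, which is the kind of instability that repeated testing is meant to surface.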
