LG AI CL IR | May 23, 2025

Evidence-Grounded Multimodal Misinformation Detection with Attention-Based GNNs

Sharad Duwal, Mir Nafis Sharear Shopnil, Abhishek Tyagi, Adiba Mahbub Proma

arXiv:2505.18221v1 | 1 citation | h-index: 4

Originality: Incremental advance

AI Analysis

This paper addresses misinformation detection for social media users and platforms, though it is an incremental improvement over existing methods.

The paper tackles multimodal out-of-context misinformation detection by proposing a graph-based method that constructs evidence and claim graphs from online evidence and captions, then uses GNNs to evaluate consistency, achieving 93.05% accuracy and outperforming the best LLM by 2.82%.

Multimodal out-of-context (OOC) misinformation repurposes real images with unrelated or misleading captions. Detecting it is challenging because the context of the claim must be resolved before the claim can be checked. Many current methods, including LLMs and LVLMs, do not perform this contextualization step, and LLMs hallucinate in the absence of context or parametric knowledge. In this work, we propose a graph-based method that evaluates the consistency between the image and the caption by constructing two graph representations: an evidence graph, derived from online textual evidence, and a claim graph, derived from the claim in the caption. Our framework uses graph neural networks (GNNs) to encode and compare these representations, and then evaluates the truthfulness of image-caption pairs. We create datasets for our graph-based method, then evaluate our baseline model and compare it against popular LLMs on the misinformation detection task. Our method scores $93.05\%$ detection accuracy on the evaluation set and outperforms the second-best performing method (an LLM) by $2.82\%$, making a case for smaller and task-specific methods.
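The encode-and-compare idea in the abstract can be illustrated with a minimal sketch. This is not the authors' architecture: the abstract gives no layer definitions, so the code assumes a single untrained attention-weighted aggregation step, mean pooling, and cosine similarity as the consistency score. All function names, the toy node features, and the graphs themselves are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def attention_layer(features, edges):
    """One round of attention-weighted neighbor aggregation.

    Assumption: attention scores are plain dot products between node
    features; a trained GNN would use learned projections instead.
    """
    n = len(features)
    neighbors = {i: [i] for i in range(n)}  # include self-loops
    for a, b in edges:  # treat edges as undirected
        neighbors[a].append(b)
        neighbors[b].append(a)
    updated = []
    for i in range(n):
        weights = softmax([dot(features[i], features[j]) for j in neighbors[i]])
        dim = len(features[i])
        updated.append([
            sum(w * features[j][k] for w, j in zip(weights, neighbors[i]))
            for k in range(dim)
        ])
    return updated


def graph_embedding(features, edges):
    """Encode a graph as the mean of its updated node features."""
    h = attention_layer(features, edges)
    return [sum(row[k] for row in h) / len(h) for k in range(len(h[0]))]


def cosine(u, v):
    return dot(u, v) / (math.sqrt(dot(u, u)) * math.sqrt(dot(v, v)))


def consistency_score(claim, evidence):
    """Cosine similarity between claim-graph and evidence-graph embeddings."""
    return cosine(graph_embedding(*claim), graph_embedding(*evidence))


# Hypothetical toy graphs: (node feature vectors, edge list).
evidence = ([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.5, 0.5, 0.0]], [(0, 1), (1, 2)])
matching_claim = ([[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]], [(0, 1)])
mismatched_claim = ([[0.0, 0.0, 1.0], [0.0, 0.2, 1.0]], [(0, 1)])

print(consistency_score(matching_claim, evidence))
print(consistency_score(mismatched_claim, evidence))
```

In this toy setup the claim graph that shares features with the evidence graph scores much higher than the one that does not. A real system would learn the attention parameters and replace the cosine threshold with a trained classifier over the two graph encodings.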
