CL CV GRAug 29, 2025

Is this chart lying to me? Automating the detection of misleading visualizations

Jonathan Tonglet, Jan Zimny, Tinne Tuytelaars, Iryna Gurevych

arXiv:2508.21675v18.33 citationsh-index: 76

Originality Synthesis-oriented

AI Analysis

This addresses the spread of misinformation on social media and the web by automating detection of misleading charts, though it is incremental as it builds on prior work by providing new datasets.

The paper tackles the problem of detecting misleading visualizations by introducing Misviz, a benchmark of 2,604 real-world visualizations annotated with 12 types of misleaders, and Misviz-synth, a synthetic dataset of 81,814 visualizations, and finds that the task remains highly challenging for state-of-the-art models.

Misleading visualizations are a potent driver of misinformation on social media and the web. By violating chart design principles, they distort data and lead readers to draw inaccurate conclusions. Prior work has shown that both humans and multimodal large language models (MLLMs) are frequently deceived by such visualizations. Automatically detecting misleading visualizations and identifying the specific design rules they violate could help protect readers and reduce the spread of misinformation. However, the training and evaluation of AI models has been limited by the absence of large, diverse, and openly available datasets. In this work, we introduce Misviz, a benchmark of 2,604 real-world visualizations annotated with 12 types of misleaders. To support model training, we also release Misviz-synth, a synthetic dataset of 81,814 visualizations generated using Matplotlib and based on real-world data tables. We perform a comprehensive evaluation on both datasets using state-of-the-art MLLMs, rule-based systems, and fine-tuned classifiers. Our results reveal that the task remains highly challenging. We release Misviz, Misviz-synth, and the accompanying code.

View on arXiv PDF

Similar