HC, CL, CV · Oct 8, 2025

GPT-5 Model Corrected GPT-4V's Chart Reading Errors, Not Prompting

arXiv:2510.06782v1
Originality: Incremental advance
AI Analysis

This work addresses chart reading accuracy for users of multimodal AI systems, but it is incremental as it focuses on comparing two specific models.

The study compares the inference accuracy of GPT-5 and GPT-4V on 107 visualization questions drawn from difficult image instances that GPT-4V had answered incorrectly. GPT-5 largely improved accuracy, while prompt variants had only small effects.

We present a quantitative evaluation to understand the effect of zero-shot large language models (LLMs) and prompting on chart reading tasks. We asked LLMs to answer 107 visualization questions to compare inference accuracy between the agentic GPT-5 and the multimodal GPT-4V on difficult image instances where GPT-4V failed to produce correct answers. Our results show that model architecture dominates inference accuracy: GPT-5 largely improved accuracy, while prompt variants yielded only small effects. Pre-registration of this work is available here: https://osf.io/u78td/?view_only=6b075584311f48e991c39335c840ded3; the Google Drive materials are here: https://drive.google.com/file/d/1ll8WWZDf7cCNcfNWrLViWt8GwDNSvVrp/view.
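As a rough illustration of the comparison the abstract describes, the sketch below groups per-question outcomes by model and prompt variant and computes the accuracy for each pair. It is not the authors' analysis code: the function name `accuracy_by_condition`, the record fields, the labels `model_a`, `model_b`, `baseline`, and `variant`, and all the toy values are assumptions made up for this example, and none of the numbers are results from the paper.

```python
from collections import defaultdict


def accuracy_by_condition(records):
    """Return {(model, prompt): fraction of correct answers}."""
    totals = defaultdict(int)
    correct = defaultdict(int)
    for r in records:
        key = (r["model"], r["prompt"])
        totals[key] += 1
        correct[key] += int(r["correct"])
    return {key: correct[key] / totals[key] for key in totals}


# Placeholder records invented for illustration only (NOT data from the paper).
records = [
    {"model": "model_a", "prompt": "baseline", "correct": True},
    {"model": "model_a", "prompt": "baseline", "correct": False},
    {"model": "model_a", "prompt": "variant", "correct": True},
    {"model": "model_a", "prompt": "variant", "correct": True},
    {"model": "model_b", "prompt": "baseline", "correct": False},
    {"model": "model_b", "prompt": "baseline", "correct": False},
]

scores = accuracy_by_condition(records)
for (model, prompt), acc in sorted(scores.items()):
    print(f"{model:8s} {prompt:9s} accuracy = {acc:.2f}")
```

Reading the resulting table by model shows the model effect, and reading it by prompt variant within one model shows the prompting effect, which is the contrast the abstract draws.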

