CL · Jul 2, 2025

Chart Question Answering from Real-World Analytical Narratives

Maeve Hutchinson, Radu Jianu, Aidan Slingsby, Jo Wood, Pranava Madhyastha

arXiv:2507.01627v1 · 8.32 citations · h-index: 37 · ACL

Originality: Synthesis-oriented

AI Analysis

This paper addresses the need for more realistic chart question answering (CQA) benchmarks for researchers and practitioners, though the contribution is incremental: it extends existing CQA work with a new dataset.

The authors build a new chart question answering dataset from real-world visualization notebooks, featuring multi-view charts and natural language questions grounded in analytical narratives. Benchmarking multimodal large language models on it shows a significant performance gap: GPT-4.1 reaches only 69.3% accuracy, which highlights how challenging this more authentic setting is.

We present a new dataset for chart question answering (CQA) constructed from visualization notebooks. The dataset features real-world, multi-view charts paired with natural language questions grounded in analytical narratives. Unlike prior benchmarks, our data reflects ecologically valid reasoning workflows. Benchmarking state-of-the-art multimodal large language models reveals a significant performance gap, with GPT-4.1 achieving an accuracy of 69.3%, underscoring the challenges posed by this more authentic CQA setting.
