CV | Mar 29, 2025

RefChartQA: Grounding Visual Answer on Chart Images through Instruction Tuning

arXiv:2503.23131v2 | 7 citations | h-index: 10 | Has Code | ICDAR
Originality: Incremental advance
AI Analysis

The paper addresses the under-explored challenge of visual grounding in chart images, with the aim of better human-computer interaction and accessibility. It represents a domain-specific advance.

The paper tackles the problem of chart understanding by introducing RefChartQA, a benchmark that combines Chart Question Answering with visual grounding to identify supporting visual elements, and shows that incorporating spatial awareness improves response accuracy by over 15% in experiments with state-of-the-art VLMs.

Recently, Vision Language Models (VLMs) have increasingly emphasized document visual grounding to achieve better human-computer interaction, accessibility, and detailed understanding. However, the application of visual grounding to visualizations such as charts remains under-explored due to the inherent complexity of interleaved visual-numerical relationships in chart images. Existing chart understanding methods primarily focus on answering questions without explicitly identifying the visual elements that support their predictions. To bridge this gap, we introduce RefChartQA, a novel benchmark that integrates Chart Question Answering (ChartQA) with visual grounding, enabling models to refer to elements at multiple granularities within chart images. Furthermore, we conduct a comprehensive evaluation by instruction-tuning 5 state-of-the-art VLMs across different categories. Our experiments demonstrate that incorporating spatial awareness via grounding improves response accuracy by over 15%, reduces hallucinations, and improves model reliability. Additionally, we identify key factors influencing text-spatial alignment, such as architectural improvements in TinyChart, which leverages a token-merging module for enhanced feature fusion. Our dataset is open-sourced for community development and further advancements. All models and code will be publicly available at https://github.com/moured/RefChartQA.
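
To make "answering with grounding" concrete, the following is a minimal sketch of how a grounded chart answer might be represented and checked. It is not the paper's evaluation protocol: the `GroundedAnswer` record, the `is_grounded_correct` helper, the case-insensitive exact-match rule, and the IoU threshold of 0.5 are all assumptions made for illustration. Only intersection-over-union itself is a standard measure for comparing bounding boxes.

```python
from dataclasses import dataclass
from typing import Tuple

# Axis-aligned box as (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]


@dataclass
class GroundedAnswer:
    """Hypothetical record: an answer plus the chart region that supports it."""
    answer: str
    box: Box


def iou(a: Box, b: Box) -> float:
    """Intersection-over-union of two axis-aligned boxes."""
    # Corners of the overlapping rectangle (may be empty).
    ix_min, iy_min = max(a[0], b[0]), max(a[1], b[1])
    ix_max, iy_max = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix_max - ix_min) * max(0.0, iy_max - iy_min)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def is_grounded_correct(pred: GroundedAnswer, gold: GroundedAnswer,
                        iou_threshold: float = 0.5) -> bool:
    """Assumed rule: the answer text must match AND the box must overlap enough."""
    same_answer = pred.answer.strip().lower() == gold.answer.strip().lower()
    return same_answer and iou(pred.box, gold.box) >= iou_threshold


if __name__ == "__main__":
    gold = GroundedAnswer("42", (10.0, 20.0, 50.0, 80.0))
    pred = GroundedAnswer("42", (12.0, 22.0, 50.0, 80.0))
    print(is_grounded_correct(pred, gold))
```

Under this assumed rule, a prediction counts as correct only when both the text and the supporting region agree with the reference. A model that answers correctly but points at the wrong bar or legend entry is therefore counted as wrong.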

Code Implementations: 1 repo
