CL, CV, LG | Dec 11, 2024

Rethinking Comprehensive Benchmark for Chart Understanding: A Perspective from Scientific Literature

arXiv:2412.12150v1 | 7 citations | h-index: 4
Originality: Incremental advance
AI Analysis

This paper addresses the need for more realistic evaluation of chart understanding models on scientific literature, though the advance is incremental because it builds on existing benchmark efforts.

The authors tackle the problem of evaluating multimodal models on complex scientific charts by introducing the SCI-CQA benchmark, which includes 37,607 high-quality charts and 5,629 questions. Their findings indicate that existing benchmarks inflate performance and that contextual information is crucial for accurate understanding.

Charts in scientific literature often contain complex visual elements, including multi-plot figures, flowcharts, and structural diagrams. Evaluating multimodal models on these authentic and intricate charts provides a more accurate assessment of their understanding abilities. However, existing benchmarks face limitations: a narrow range of chart types, overly simplistic template-based questions and visual elements, and inadequate evaluation methods. These shortcomings lead to inflated performance scores that fail to hold up when models encounter real-world scientific charts. To address these challenges, we introduce a new benchmark, Scientific Chart QA (SCI-CQA), which emphasizes flowcharts as a critical yet often overlooked category. To overcome the limitations of chart variety and simplistic visual elements, we curated a dataset of 202,760 image-text pairs from papers published at 15 top-tier computer science conferences over the past decade. After rigorous filtering, we refined this to 37,607 high-quality charts with contextual information. SCI-CQA also introduces a novel evaluation framework inspired by human exams, encompassing 5,629 carefully curated questions, both objective and open-ended. Additionally, we propose an efficient annotation pipeline that significantly reduces data annotation costs. Finally, we explore context-based chart understanding, highlighting the crucial role of contextual information in solving previously unanswerable questions.
