CV, AI, MM | Apr 5, 2023

ChartReader: A Unified Framework for Chart Derendering and Comprehension without Heuristic Rules

CMU, UW
arXiv:2304.02173v1 | 34 citations | h-index: 39 | Has Code
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of automating chart analysis for data scientists and analysts, reducing manual effort. The advance is incremental, since it builds on existing transformer-based and vision-language models.

The paper tackles the problem of chart comprehension by introducing ChartReader, a unified framework that integrates chart derendering and comprehension without heuristic rules, achieving superior performance on Chart-to-Table, ChartQA, and Chart-to-Text tasks compared to existing methods.

Charts are a powerful tool for visually conveying complex data, but their comprehension poses a challenge due to the diverse chart types and intricate components. Existing chart comprehension methods suffer from either heuristic rules or an over-reliance on OCR systems, resulting in suboptimal performance. To address these issues, we present ChartReader, a unified framework that seamlessly integrates chart derendering and comprehension tasks. Our approach includes a transformer-based chart component detection module and an extended pre-trained vision-language model for chart-to-X tasks. By learning the rules of charts automatically from annotated datasets, our approach eliminates the need for manual rule-making, reducing effort and enhancing accuracy. We also introduce a data variable replacement technique and extend the input and position embeddings of the pre-trained model for cross-task training. We evaluate ChartReader on Chart-to-Table, ChartQA, and Chart-to-Text tasks, demonstrating its superiority over existing methods. Our proposed framework can significantly reduce the manual effort involved in chart analysis, providing a step towards a universal chart understanding model. Moreover, our approach offers opportunities for plug-and-play integration with mainstream LLMs such as T5 and TaPas, extending their capability to chart comprehension tasks. The code is available at https://github.com/zhiqic/ChartReader.

Code Implementations: 1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
