CL · Mar 14, 2024

ChartInstruct: Instruction Tuning for Chart Comprehension and Reasoning

arXiv:2403.09028v1 · 80 citations · ACL
Originality: Incremental advance
AI Analysis

This work addresses the need for more versatile chart analysis tools in data-driven fields, though the advance is incremental because it builds on existing vision-language and instruction-tuning methods.

The paper tackles the problem of limited real-world applicability of task-specific models for chart comprehension by introducing ChartInstruct, a novel chart-specific vision-language instruction-following dataset with 191K instructions from 71K charts, and presents two instruction-tuning systems that achieve state-of-the-art results on four downstream tasks.

Charts provide visual representations of data and are widely used for analyzing information, addressing queries, and conveying insights to others. Various chart-related downstream tasks have emerged recently, such as question-answering and summarization. A common strategy to solve these tasks is to fine-tune various models originally trained on vision-language tasks. However, such task-specific models are not capable of solving a wide range of chart-related tasks, constraining their real-world applicability. To overcome these challenges, we introduce ChartInstruct: a novel chart-specific vision-language instruction-following dataset comprising 191K instructions generated with 71K charts. We then present two distinct systems for instruction tuning on such datasets: (1) an end-to-end model that connects a vision encoder for chart understanding with an LLM; and (2) a pipeline model that employs a two-step approach to extract chart data tables and input them into the LLM. In experiments on four downstream tasks, we first show the effectiveness of our model, achieving a new set of state-of-the-art results. Further evaluation shows that our instruction-tuning approach supports a wide array of real-world chart comprehension and reasoning scenarios, thereby expanding the scope and applicability of our models to new kinds of tasks.
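The two-step pipeline described in the abstract (extract a data table from the chart, then pass it to the LLM) can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `linearize_table`, `build_prompt`, and `run_pipeline`, the prompt layout, and the ` | ` cell separator are all assumptions made for illustration, and the table extractor and LLM are passed in as plain callables rather than tied to any real model.

```python
from typing import Callable, List, Sequence

Table = List[List[str]]  # assumption: first row holds the column headers


def linearize_table(table: Sequence[Sequence[str]]) -> str:
    """Flatten a table to text: cells joined by ' | ', rows by newlines."""
    return "\n".join(" | ".join(str(cell) for cell in row) for row in table)


def build_prompt(table: Sequence[Sequence[str]], instruction: str) -> str:
    """Combine the linearized table and the user instruction (hypothetical layout)."""
    return (
        f"Data table:\n{linearize_table(table)}\n\n"
        f"Instruction: {instruction}\nAnswer:"
    )


def run_pipeline(
    chart_image: bytes,
    instruction: str,
    extract_table: Callable[[bytes], Table],  # step 1: chart image -> data table
    generate: Callable[[str], str],           # step 2: text prompt -> LLM output
) -> str:
    """Two-step pipeline: extract the chart's table, then query the LLM with it."""
    table = extract_table(chart_image)
    return generate(build_prompt(table, instruction))
```

Passing the extractor and the generator as callables keeps the sketch independent of any particular chart-to-table model or LLM; the end-to-end system in the abstract would instead feed vision-encoder features to the LLM directly, with no intermediate table.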

Code Implementations: 1 repo