CL · Mar 17, 2024

ChartThinker: A Contextual Chain-of-Thought Approach to Optimized Chart Summarization

arXiv:2403.11236v2 · 91 citations · h-index: 18 · LREC
AI Analysis

This paper aims to improve the logical coherence and accuracy of automated chart summarization for data analysis applications, a strong domain-specific advancement.

The paper tackles deficiencies in chart summarization by constructing a large-scale dataset of chart-caption pairs and proposing ChartThinker, a method combining chain-of-thought reasoning with context retrieval, which outperforms 8 state-of-the-art models across 7 evaluation metrics.

Data visualization serves as a critical means for presenting data and mining its valuable insights. The task of chart summarization uses natural language processing techniques to facilitate in-depth data analysis of charts. However, existing approaches still show notable deficiencies in visual-language matching and reasoning ability. To address these limitations, this study constructs a large-scale dataset of comprehensive chart-caption pairs, with fine-tuning instructions for each chart. Thanks to the dataset's broad coverage of topics and visual styles, a better matching degree can be achieved from the perspective of training data. Moreover, we propose an innovative chart summarization method, ChartThinker, which synthesizes deep analysis based on chains of thought and context retrieval strategies, aiming to improve the logical coherence and accuracy of the generated summaries. Built upon the curated dataset, our trained model consistently exhibits superior performance in chart summarization tasks, surpassing 8 state-of-the-art models over 7 evaluation metrics. Our dataset and code are publicly accessible.

Code Implementations: 1 repo
