CL | Jul 20, 2025

Doc2Chart: Intent-Driven Zero-Shot Chart Generation from Documents

Akriti Jain, Pritika Ramu, Aparna Garimella, Apoorv Saxena

arXiv:2507.14819v2 | 6.72 citations | h-index: 15 | EMNLP

Originality: Incremental advance

AI Analysis

This work addresses the challenge of automating data visualization from documents without manual content selection. The advance is incremental because it builds on existing LLM capabilities for chart generation.

The paper tackles the problem of generating charts from long documents based on user intents in a zero-shot setting, proposing a two-staged framework that outperforms baselines by up to 9 points in chart data accuracy and 17 points in chart type selection.

Large Language Models (LLMs) have demonstrated strong capabilities in transforming text descriptions or tables into data visualizations via instruction-tuning methods. However, it is not straightforward to apply these methods directly to the more realistic use case of visualizing data from long documents based on user-given intents, where the user does not manually pre-select the relevant content. We introduce the task of intent-based chart generation from documents: given a user-specified intent and document(s), the goal is to generate a chart that adheres to the intent and is grounded in the document(s), in a zero-shot setting. We propose an unsupervised, two-staged framework in which an LLM first extracts relevant information from the document(s) by decomposing the intent, then iteratively validates and refines this data. Next, a heuristic-guided module selects an appropriate chart type before final code generation. To assess the data accuracy of the generated charts, we propose an attribution-based metric that uses a structured textual representation of charts, instead of relying on visual decoding metrics that often fail to capture the chart data effectively. To validate our approach, we curate a dataset comprising 1,242 <intent, document, charts> tuples from two domains, finance and scientific, in contrast to existing datasets, which are largely limited to parallel text descriptions/tables and their corresponding charts. We compare our approach with baselines that use single-shot chart generation with LLMs and query-based retrieval methods; our method outperforms the best baselines by up to 9 points in chart data accuracy and 17 points in chart type selection.
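The two stages described in the abstract can be pictured with a minimal Python sketch. Everything named here is an assumption made for illustration: `DataPoint`, `choose_chart_type`, `generate_chart_spec`, the keyword rules, and the `extract`/`validate` callables (stand-ins for LLM calls) are hypothetical and are not taken from the paper, whose actual prompts and heuristics are not given in this listing. The final chart-code generation step is left out.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class DataPoint:
    label: str
    value: float


def choose_chart_type(points: List[DataPoint], intent: str) -> str:
    """Hypothetical keyword heuristics; not the paper's actual rules."""
    text = intent.lower()
    if any(word in text for word in ("trend", "over time", "growth")):
        return "line"
    if any(word in text for word in ("share", "proportion", "breakdown")) and len(points) <= 6:
        return "pie"
    return "bar"


def generate_chart_spec(
    intent: str,
    document: str,
    extract: Callable[[str, str], List[DataPoint]],
    validate: Callable[[str, str, List[DataPoint]], List[DataPoint]],
    max_rounds: int = 3,
) -> Dict[str, object]:
    # Stage 1: extract data for the intent, then validate and refine
    # until the data stops changing or the round budget is used up.
    points = extract(intent, document)
    for _ in range(max_rounds):
        refined = validate(intent, document, points)
        if refined == points:
            break
        points = refined
    # Stage 2: pick a chart type with simple heuristics.
    return {
        "type": choose_chart_type(points, intent),
        "data": [(p.label, p.value) for p in points],
    }
```

In practice `extract` and `validate` would wrap LLM prompts; passing them in as callables keeps the control flow testable without a model.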

