VIS-Shepherd: Constructing Critic for LLM-based Data Visualization Generation
This work addresses the need for automated improvement of LLM-generated data visualizations, offering a domain-specific, incremental advance.
The paper tackles the problem of suboptimal LLM-generated data visualizations by introducing VIS-Shepherd, a specialized MLLM-based critic that provides feedback on them. By leveraging the accompanying critique dataset, small (7B-parameter) open-source models reach performance comparable to much larger open-source or even proprietary models.
Data visualization generation using Large Language Models (LLMs) has shown promising results but often produces suboptimal visualizations that require human intervention for improvement. In this work, we introduce VIS-Shepherd, a specialized Multimodal Large Language Model (MLLM)-based critic to evaluate and provide feedback for LLM-generated data visualizations. At the core of our approach is a framework to construct a high-quality visualization critique dataset, where we collect human-created visualization instances, synthesize corresponding LLM-generated instances, and construct high-quality critiques. We conduct both model-based automatic evaluation and human preference studies to evaluate the effectiveness of our approach. Our experiments show that even small (7B-parameter) open-source MLLMs achieve substantial performance gains by leveraging our high-quality visualization critique dataset, reaching levels comparable to much larger open-source or even proprietary models. Our work demonstrates significant potential for MLLM-based automated visualization critique and indicates promising directions for enhancing LLM-based data visualization generation. Our project page is available at https://github.com/bopan3/VIS-Shepherd.