CV | Sep 3, 2024 | Code
EvoChart: A Benchmark and a Self-Training Approach Towards Real-World Chart Understanding
Muye Huang, Han Lai, Xinyu Zhang et al.
Chart understanding enables automated data analysis for humans, which requires models to achieve highly accurate visual comprehension. While existing Visual Language Models (VLMs) have shown progress in chart understanding, the lack of high-quality training data and comprehensive evaluation benchmarks hinders VLM chart comprehension. In this paper, we introduce EvoChart, a novel self-training method for generating synthetic chart data to enhance VLMs' capabilities in real-world chart comprehension. We also propose EvoChart-QA, a novel benchmark for measuring models' chart comprehension abilities in real-world scenarios. Specifically, EvoChart is a self-training data synthesis approach that simultaneously produces a high-quality training corpus and a high-performance chart understanding model. EvoChart-QA consists of 650 distinct real-world charts collected from 140 different websites and 1,250 expert-curated questions that focus on chart understanding. Experimental results on various open-source and proprietary VLMs tested on EvoChart-QA demonstrate that even the best proprietary model, GPT-4o, achieves only 49.8% accuracy. Moreover, the EvoChart method significantly boosts the performance of open-source VLMs on real-world chart understanding tasks, achieving 54.2% accuracy on EvoChart-QA.
SE | Oct 11, 2012 | Code
A lightweight forum-based distributed requirement elicitation process for open source community
Han Lai, Rong Peng, Dong Sun et al.
Nowadays, many open source communities use forums to gather requirements from scattered stakeholders. However, the requirements collection process often suffers from unformatted descriptions and unfocused discussions. In this paper, we establish a framework, ReqForum, that defines the metamodel of the requirement elicitation forum. Based on it, we propose a lightweight forum-based requirements elicitation process consisting of six steps: template-based requirements creation, opinion collection, requirements collection, requirements management, capability identification, and the incentive mechanism. Following the proposed process, we built the prototype SKLSEForum by composing Discuz with its existing plug-ins. The implementation indicates that the process is feasible and economical.
CV | May 25, 2025
ChartSketcher: Reasoning with Multimodal Feedback and Reflection for Chart Understanding
Muye Huang, Lingling Zhang, Jie Ma et al.
Charts are high-density visualization carriers for complex data, serving as a crucial medium for information extraction and analysis. Automated chart understanding poses significant challenges to existing multimodal large language models (MLLMs) due to the need for precise and complex visual reasoning. Current step-by-step reasoning models primarily focus on text-based logical reasoning for chart understanding. However, they struggle to refine or correct their reasoning when errors stem from flawed visual understanding, as they lack the ability to leverage multimodal interaction for deeper comprehension. Inspired by human cognitive behavior, we propose ChartSketcher, a multimodal feedback-driven step-by-step reasoning method designed to address these limitations. ChartSketcher is a chart understanding model that employs Sketch-CoT, enabling MLLMs to annotate intermediate reasoning steps directly onto charts using a programmatic sketching library, iteratively feeding these visual annotations back into the reasoning process. This mechanism enables the model to visually ground its reasoning and refine its understanding over multiple steps. We employ a two-stage training strategy: a cold start phase to learn sketch-based reasoning patterns, followed by off-policy reinforcement learning to enhance reflection and generalization. Experiments demonstrate that ChartSketcher achieves promising performance on chart understanding benchmarks and general vision tasks, providing an interactive and interpretable approach to chart comprehension.