LG, AI, CV | Aug 28, 2024

AutoGeo: Automating Geometric Image Dataset Creation for Enhanced Geometry Understanding

arXiv:2409.09039v1 | 33 citations | h-index: 22
Originality Highly original
AI Analysis

AutoGeo addresses a critical gap in the availability of geometric data for AI research, supporting better geometry understanding in education and research tools.

The paper tackles the lack of high-quality geometric datasets for AI by introducing AutoGeo, which automatically generates 100k geometric image-text pairs. Fine-tuning multimodal models on this data significantly improves their accuracy on tasks such as geometric captioning and reasoning.

With the rapid advancement of large language models, there has been growing interest in their mathematical reasoning capabilities. However, existing research has focused primarily on text-based algebra problems and has neglected geometry because high-quality geometric datasets are lacking. To address this gap, this paper introduces AutoGeo, a novel approach for automatically generating mathematical geometric images to meet the demand for large-scale, diverse geometric datasets. AutoGeo enables the creation of AutoGeo-100k, an extensive repository of 100k high-quality geometry image-text pairs. By leveraging precisely defined geometric clauses, AutoGeo-100k covers a wide variety of geometric content, including lines, polygons, circles, and complex spatial relationships. Furthermore, this paper demonstrates that fine-tuning on AutoGeo-100k enhances the performance of multimodal large language models. Experimental results indicate significant improvements in the models' ability to handle geometric images, as evidenced by higher accuracy on tasks such as geometric captioning and mathematical reasoning. This research not only fills a critical gap in the availability of geometric datasets but also paves the way for more sophisticated AI-driven tools in education and research. Project page: https://autogeo-official.github.io/.
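The idea of turning a geometric clause into a matched image-text pair can be illustrated with a toy sketch. This is not AutoGeo's actual pipeline or clause format: the clause dictionaries, the function names (`sample_clause`, `render_svg`, `caption`, `make_pair`), and the choice of SVG as the image format are all assumptions made purely for illustration.

```python
import random

# Hypothetical toy generator; not the AutoGeo implementation.

def sample_clause(rng):
    """Sample an illustrative clause: a shape kind plus numeric parameters."""
    kind = rng.choice(["segment", "triangle", "circle"])
    if kind == "circle":
        return {"kind": kind, "center": (100, 100), "radius": rng.randint(20, 80)}
    n_points = 2 if kind == "segment" else 3
    # Degenerate cases (coincident or collinear points) are not filtered here.
    points = [(rng.randint(10, 190), rng.randint(10, 190)) for _ in range(n_points)]
    return {"kind": kind, "points": points}

def render_svg(clause):
    """Render the clause as a 200x200 SVG string (the 'image' half of the pair)."""
    if clause["kind"] == "circle":
        cx, cy = clause["center"]
        body = f'<circle cx="{cx}" cy="{cy}" r="{clause["radius"]}" fill="none" stroke="black"/>'
    else:
        pts = " ".join(f"{x},{y}" for x, y in clause["points"])
        tag = "polyline" if clause["kind"] == "segment" else "polygon"
        body = f'<{tag} points="{pts}" fill="none" stroke="black"/>'
    return f'<svg xmlns="http://www.w3.org/2000/svg" width="200" height="200">{body}</svg>'

def caption(clause):
    """Describe the same clause in text (the 'text' half of the pair)."""
    if clause["kind"] == "circle":
        return f"A circle with center {clause['center']} and radius {clause['radius']}."
    names = "ABC"[: len(clause["points"])]
    labelled = ", ".join(f"{n}{p}" for n, p in zip(names, clause["points"]))
    shape = "Segment" if clause["kind"] == "segment" else "Triangle"
    return f"{shape} {names} with vertices {labelled}."

def make_pair(seed):
    """Generate one reproducible (image, text) pair from a seed."""
    clause = sample_clause(random.Random(seed))
    return render_svg(clause), caption(clause)

if __name__ == "__main__":
    svg, text = make_pair(0)
    print(text)
    print(svg)
```

Because both the image and the caption are derived from the same clause, the text is consistent with the drawing by construction, which is the property that makes automatic generation attractive for building a large dataset.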

