CV · Apr 16

Geoparsing: Diagram Parsing for Plane and Solid Geometry with a Unified Formal Language

arXiv:2604.11600 · 87.0 · h-index: 8
AI Analysis

This work addresses the perception bottleneck in geometric reasoning for multimodal LLMs by providing a formal language and dataset that covers both plane and solid geometry, enabling improved performance on geometry tasks.

The paper introduces a unified formal language for plane and solid geometry, constructs a large-scale dataset (GDP-29K) with 20k plane and 9k solid geometry samples, and proposes a training paradigm combining supervised fine-tuning with reinforcement learning via verifiable rewards, achieving state-of-the-art parsing performance and boosting downstream geometry reasoning in MLLMs.

Multimodal Large Language Models (MLLMs) have achieved remarkable progress but continue to struggle with geometric reasoning, primarily because of a perception bottleneck in recognizing fine-grained visual elements. While formal languages have aided plane geometry understanding, solid geometry, which requires spatial understanding, remains largely unexplored. In this paper, we address this challenge by designing a unified formal language that integrates plane and solid geometry, comprehensively covering geometric structures and semantic relations. We construct GDP-29K, a large-scale dataset comprising 20k plane and 9k solid geometry samples collected from diverse real-world sources, each paired with its ground-truth formal description. To ensure syntactic correctness and geometric consistency, we propose a training paradigm that combines Supervised Fine-Tuning with Reinforcement Learning via Verifiable Rewards. Experiments show that our approach achieves state-of-the-art parsing performance. Furthermore, we demonstrate that our parsed formal descriptions serve as a critical cognitive scaffold, significantly boosting MLLMs' capabilities for downstream geometry reasoning tasks. Our data and code are available at Geoparsing.
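To make the idea of a verifiable reward concrete, here is a minimal sketch of how a parser's output could be scored automatically. It is an illustration only, not the paper's implementation: the statement syntax (`Point(A)`, `Line(A, B)`), the consistency rule, and the function names `parse`, `is_consistent`, and `reward` are all assumptions made for this example, since the abstract does not show the actual formal language or reward design.

```python
import re

# Hypothetical statement syntax (an assumption, not the paper's language):
# one statement per line, written as Name(arg1, arg2, ...).
STATEMENT = re.compile(r"^([A-Za-z]+)\(([A-Za-z0-9]+(?:,\s*[A-Za-z0-9]+)*)\)$")


def parse(description):
    """Return a set of (predicate, args) tuples, or None if any line is malformed."""
    statements = set()
    for line in description.strip().splitlines():
        match = STATEMENT.match(line.strip())
        if match is None:
            return None  # syntactic check failed
        args = tuple(a.strip() for a in match.group(2).split(","))
        statements.add((match.group(1), args))
    return statements


def is_consistent(statements):
    """Toy consistency rule: every point used in a relation must be declared."""
    declared = {args[0] for name, args in statements if name == "Point"}
    used = {a for name, args in statements if name != "Point" for a in args}
    return used <= declared


def reward(predicted, reference):
    """Return 0.0 for unparseable or inconsistent output, else F1 overlap with the reference."""
    pred = parse(predicted)
    ref = parse(reference)
    if pred is None or ref is None or not is_consistent(pred):
        return 0.0
    overlap = len(pred & ref)  # exact match; argument order matters in this sketch
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred)
    recall = overlap / len(ref)
    return 2 * precision * recall / (precision + recall)
```

In this sketch, a description that fails to parse or that references an undeclared point receives zero reward, and a well-formed description is scored by how many of its statements match the ground truth. A real reward would likely need to treat equivalent statements (for example, `Line(A, B)` and `Line(B, A)`) as the same.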

