LG Jan 8

Milestones over Outcome: Unlocking Geometric Reasoning with Sub-Goal Verifiable Reward

Jianlong Chen, Daocheng Fu, Shengze Xu, Jiawei Chen, Yuan Feng, Yue Yang, Junchi Yan, Hongyuan Zha, Renqiu Xia

arXiv:2601.05073v13.82 citations h-index: 10

Originality Highly original

AI Analysis

This paper addresses a critical bottleneck in geometric reasoning for AI systems and offers a novel approach that applies beyond geometry.

The paper attributes the difficulty multimodal large language models have with complex geometric reasoning to outcome-based supervision. It introduces a subgoal-level evaluation and learning paradigm that improves geometric performance by 9.7% and generalizes to other tasks.

Multimodal Large Language Models (MLLMs) struggle with complex geometric reasoning, largely because "black box" outcome-based supervision fails to distinguish between lucky guesses and rigorous deduction. To address this, we introduce a paradigm shift towards subgoal-level evaluation and learning. We first construct GeoGoal, a benchmark synthesized via a rigorous formal verification data engine, which converts abstract proofs into verifiable numeric subgoals. This structure reveals a critical divergence between reasoning quality and outcome accuracy. Leveraging this, we propose the Sub-Goal Verifiable Reward (SGVR) framework, which replaces sparse signals with dense rewards based on the Skeleton Rate. Experiments demonstrate that SGVR not only enhances geometric performance (+9.7%) but also exhibits strong generalization, transferring gains to general math (+8.0%) and other general reasoning tasks (+2.8%), demonstrating broad applicability across diverse domains.
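The gap between outcome accuracy and reasoning quality described in the abstract can be illustrated with a small sketch. The abstract does not define the Skeleton Rate, so the code below assumes it is the fraction of reference numeric subgoals that a solution reproduces within a tolerance. The function names, the tolerance, and the sample values are all hypothetical and are not taken from the paper.

```python
import math
from typing import Sequence


def skeleton_rate(
    predicted: Sequence[float],
    reference: Sequence[float],
    tol: float = 1e-3,
) -> float:
    """ASSUMED definition: share of reference subgoals matched position by position.

    The paper's actual Skeleton Rate may be defined differently.
    """
    if not reference:
        return 0.0
    hits = 0
    for i, ref in enumerate(reference):
        # A missing prediction for a subgoal counts as a miss.
        if i < len(predicted) and math.isclose(
            predicted[i], ref, rel_tol=tol, abs_tol=tol
        ):
            hits += 1
    return hits / len(reference)


def outcome_reward(pred_answer: float, ref_answer: float, tol: float = 1e-3) -> float:
    """Sparse outcome-only signal: 1.0 if the final answer matches, else 0.0."""
    return 1.0 if math.isclose(pred_answer, ref_answer, rel_tol=tol, abs_tol=tol) else 0.0


# Hypothetical problem: two intermediate subgoals, then a final answer.
ref_subgoals = [60.0, 5.0]
ref_answer = 12.5

# A "lucky guess": wrong intermediate values, correct final answer.
lucky_subgoals, lucky_answer = [45.0, 7.0], 12.5

# A rigorous solution: every intermediate value checks out.
rigorous_subgoals, rigorous_answer = [60.0, 5.0], 12.5

lucky_sparse = outcome_reward(lucky_answer, ref_answer)
lucky_dense = skeleton_rate(lucky_subgoals, ref_subgoals)
rigorous_sparse = outcome_reward(rigorous_answer, ref_answer)
rigorous_dense = skeleton_rate(rigorous_subgoals, ref_subgoals)

print(f"lucky:    outcome={lucky_sparse}, skeleton_rate={lucky_dense}")
print(f"rigorous: outcome={rigorous_sparse}, skeleton_rate={rigorous_dense}")
```

Under these assumptions both solutions receive the same outcome reward, while the subgoal-based score separates them. That separation is the kind of denser signal the abstract says SGVR relies on.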
