CV | Jul 3, 2025

MoGe-2: Accurate Monocular Geometry with Metric Scale and Sharp Details

arXiv:2507.02546v1 | 165 citations | h-index: 11
Originality: Incremental advance
AI Analysis

This work addresses the challenge of accurate monocular geometry estimation with metric scale and sharp details for applications in computer vision and robotics, representing a significant but incremental advancement over prior approaches.

The paper tackles the recovery of metric-scale 3D geometry from a single image. It proposes MoGe-2, which extends the existing affine-invariant method MoGe to predict metric scale and refines real data with synthetic labels to sharpen detail. The authors report better accuracy, scale precision, and detail recovery than previous methods.

We propose MoGe-2, an advanced open-domain geometry estimation model that recovers a metric-scale 3D point map of a scene from a single image. Our method builds upon the recent monocular geometry estimation approach, MoGe, which predicts affine-invariant point maps with unknown scales. We explore effective strategies to extend MoGe for metric geometry prediction without compromising the relative geometry accuracy provided by the affine-invariant point representation. Additionally, we discover that noise and errors in real data diminish fine-grained detail in the predicted geometry. We address this by developing a unified data refinement approach that filters and completes real data from different sources using sharp synthetic labels, significantly enhancing the granularity of the reconstructed geometry while maintaining the overall accuracy. We train our model on a large corpus of mixed datasets and conduct comprehensive evaluations, demonstrating its superior performance in achieving accurate relative geometry, precise metric scale, and fine-grained detail recovery -- capabilities that no previous method has simultaneously achieved.
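
To make the phrase "affine-invariant point maps with unknown scales" concrete, here is a minimal sketch, not the MoGe-2 method. It assumes that an affine-invariant point map differs from the metric one by a single global scale `s` and a 3D shift `t`, and it recovers both with ordinary least squares on toy data. The function names `fit_scale_shift` and `apply_scale_shift` are hypothetical, and plain Python is used only to keep the example dependency-free.

```python
# Illustrative sketch only: this is not the MoGe-2 implementation.
# Assumption: an affine-invariant point p relates to a metric point q by
# q ~= s * p + t, with one global scale s and one 3D shift t per image.

def fit_scale_shift(pred, ref):
    """Least-squares s and t such that s * pred[i] + t ~= ref[i]."""
    n = len(pred)
    mean_p = [sum(p[k] for p in pred) / n for k in range(3)]
    mean_q = [sum(q[k] for q in ref) / n for k in range(3)]
    num = 0.0
    den = 0.0
    for p, q in zip(pred, ref):
        for k in range(3):
            pc = p[k] - mean_p[k]  # centered predicted coordinate
            qc = q[k] - mean_q[k]  # centered reference coordinate
            num += pc * qc
            den += pc * pc
    s = num / den  # optimal scale for the centered point sets
    t = [mean_q[k] - s * mean_p[k] for k in range(3)]  # optimal shift
    return s, t


def apply_scale_shift(points, s, t):
    """Map scale- and shift-ambiguous points into metric coordinates."""
    return [tuple(s * p[k] + t[k] for k in range(3)) for p in points]


# Toy data: four metric points, and the same points with scale and
# shift removed to mimic an affine-invariant prediction.
metric = [(0.0, 0.0, 2.0), (1.0, 0.0, 2.5), (0.0, 1.0, 3.0), (1.0, 1.0, 4.0)]
true_s, true_t = 2.0, (0.1, -0.2, 0.5)
pred = [tuple((q[k] - true_t[k]) / true_s for k in range(3)) for q in metric]

s, t = fit_scale_shift(pred, metric)
recovered = apply_scale_shift(pred, s, t)
```

The sketch needs reference metric points to solve for `s` and `t`. A metric model such as the one the abstract describes has to predict the scale from the image alone, which is why the unknown scale is the hard part.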

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
