CV · May 8

Learning Image-Adaptive Scale Fields for Metric Depth Recovery

arXiv:2605.07418 · 23.8
AI Analysis

The paper addresses the practical problem of recovering accurate metric depth from monocular depth estimation when only sparse metric anchors are available, a requirement for applications such as robotics and AR.

The paper proposes a method for recovering metric depth from monocular depth estimation using image-adaptive scale fields, achieving improved accuracy and robustness even with extremely sparse metric anchors.

Monocular depth estimation (MDE) typically produces depth estimations that are defined up to an unknown scale or shift. When only sparse metric anchors are available, recovering accurate metric depth becomes challenging yet necessary for practical applications. We address this problem by formulating metric depth recovery as image-adaptive scale field modeling. Instead of directly correcting the depth, we reformulate the correction as a low-dimensional linear combination of image-adaptive basis maps. These maps are derived from semantic and geometric cues encoded in the MDE estimations and intermediate representations. The weights of basis maps are efficiently determined from sparse metric anchors via a least-squares problem. This formulation yields improved metric depth accuracy, strong robustness under extreme anchor sparsity, and an interpretable decomposition of spatial scale variations. Extensive experiments across multiple datasets and representative MDE models demonstrate the effectiveness and general applicability of our approach.
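The least-squares step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a purely multiplicative scale field, so that metric depth at pixel p is (sum_k w_k * B_k(p)) * d(p), where d is the relative MDE output and B_k are the basis maps. The basis maps here are hand-made placeholders (a constant and a horizontal ramp) rather than maps derived from MDE features, and the function names, the flattened pixel layout, and the small ridge term are all hypothetical choices made for illustration. Only the standard library is used so the example stays self-contained.

```python
def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def fit_scale_weights(basis_at_anchors, rel_at_anchors, metric_at_anchors, ridge=1e-8):
    """Fit basis weights from sparse anchors via regularized normal equations.

    basis_at_anchors: N rows, each holding the K basis values at one anchor.
    The ridge term is an assumption of this sketch; it keeps the system
    solvable when anchors are fewer than basis maps.
    """
    k = len(basis_at_anchors[0])
    # Row i of the design matrix is B_k(p_i) * d(p_i).
    design = [[b * d for b in row]
              for row, d in zip(basis_at_anchors, rel_at_anchors)]
    gram = [[sum(r[i] * r[j] for r in design) + (ridge if i == j else 0.0)
             for j in range(k)] for i in range(k)]
    moment = [sum(r[i] * z for r, z in zip(design, metric_at_anchors))
              for i in range(k)]
    return solve_linear(gram, moment)


def apply_scale_field(basis_maps, rel_depth, weights):
    """Return metric depth per pixel: (sum_k w_k * B_k(p)) * d(p)."""
    return [sum(w * b for w, b in zip(weights, row)) * d
            for row, d in zip(basis_maps, rel_depth)]


# Synthetic 5-pixel "image": a constant basis map and a horizontal ramp.
basis = [[1.0, x / 4.0] for x in range(5)]
rel_depth = [1.0, 1.5, 2.0, 2.5, 3.0]
true_weights = [2.0, 0.5]
metric = apply_scale_field(basis, rel_depth, true_weights)

# Only three pixels carry metric anchors.
anchors = [0, 2, 4]
weights = fit_scale_weights([basis[i] for i in anchors],
                            [rel_depth[i] for i in anchors],
                            [metric[i] for i in anchors])
recovered = apply_scale_field(basis, rel_depth, weights)
```

Because the unknowns are the K weights rather than one correction per pixel, the system stays small however large the image is, which is one plausible reading of why the formulation tolerates very sparse anchors. With array data, the same fit could be done with `numpy.linalg.lstsq` on the design matrix.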
