CV | Mar 21, 2025

Distilling Monocular Foundation Model for Fine-grained Depth Completion

arXiv:2503.16970v1 | 20 citations | h-index: 5 | Has Code | CVPR
Originality: Incremental advance
AI Analysis

This work targets depth completion for autonomous driving, where sparse LiDAR returns limit the dense supervision available for training. The advance is incremental because it builds on existing monocular foundation models.

The paper tackles the limited dense supervision available for depth completion with a two-stage knowledge distillation framework. Monocular foundation models are used to generate training data, and a scale- and shift-invariant loss handles their scale ambiguity. The method achieves state-of-the-art performance, ranking first on the KITTI benchmark.
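To make the data-generation idea concrete, here is a minimal sketch of turning a dense depth map into a sparse, LiDAR-like input. It is a crude scan-line subsampling, not the paper's procedure, which is described as using monocular depth together with mesh reconstruction. The function name `simulate_sparse_depth` and the parameters `n_beams`, `col_step`, and `max_range` are hypothetical and chosen only for illustration.

```python
import numpy as np

def simulate_sparse_depth(dense_depth, n_beams=64, col_step=4, max_range=80.0):
    """Keep depth only on evenly spaced rows ("beams") and every col_step-th column.

    Assumption: this is a simplified stand-in for LiDAR simulation. It ignores
    sensor geometry, occlusion, and noise. Zero marks "no measurement".
    """
    dense_depth = np.asarray(dense_depth, dtype=np.float32)
    h, w = dense_depth.shape

    # Evenly spaced image rows play the role of LiDAR beams.
    rows = np.unique(np.linspace(0, h - 1, num=min(n_beams, h)).round().astype(int))
    # Regular column stride plays the role of azimuth resolution.
    cols = np.arange(0, w, col_step)

    sparse = np.zeros_like(dense_depth)
    sparse[np.ix_(rows, cols)] = dense_depth[np.ix_(rows, cols)]

    # Drop returns beyond the assumed maximum sensor range.
    sparse[sparse > max_range] = 0.0
    return sparse
```

A dense map predicted by a monocular model could be passed in as `dense_depth`. The result has the same shape, with zeros wherever no simulated return exists, which is a common convention for sparse depth inputs.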

Depth completion involves predicting dense depth maps from sparse LiDAR inputs. However, sparse depth annotations from sensors limit the availability of dense supervision, which is necessary for learning detailed geometric features. In this paper, we propose a two-stage knowledge distillation framework that leverages powerful monocular foundation models to provide dense supervision for depth completion. In the first stage, we introduce a pre-training strategy that generates diverse training data from natural images and distills geometric knowledge into depth completion models. Specifically, we simulate LiDAR scans by utilizing monocular depth and mesh reconstruction, thereby creating training data without requiring ground-truth depth. In addition, monocular depth estimation suffers from inherent scale ambiguity in real-world settings. To address this, in the second stage, we employ a scale- and shift-invariant loss (SSI Loss) to learn real-world scales when fine-tuning on real-world datasets. Our two-stage distillation framework enables depth completion models to harness the strengths of monocular foundation models. Experimental results demonstrate that models trained with our two-stage distillation framework achieve state-of-the-art performance, ranking first on the KITTI benchmark. Code is available at https://github.com/Sharpiless/DMD3C
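The abstract names a scale- and shift-invariant loss but does not give its formula. The sketch below shows one common formulation, assumed here for illustration: fit a scale `s` and shift `b` by least squares so that `s * pred + b` best matches the target, then average the absolute residual. The function name `ssi_loss`, its arguments, and the choice of an L1 residual are assumptions and may differ from the authors' implementation.

```python
import numpy as np

def ssi_loss(pred, target, mask=None):
    """Scale- and shift-invariant L1 loss (least-squares alignment variant).

    Assumption: one common formulation, not necessarily the paper's.
    `mask` selects valid pixels; all pixels are used when it is None.
    """
    pred = np.asarray(pred, dtype=np.float64).ravel()
    target = np.asarray(target, dtype=np.float64).ravel()
    if mask is None:
        mask = np.ones(pred.shape, dtype=bool)
    else:
        mask = np.asarray(mask, dtype=bool).ravel()

    p, t = pred[mask], target[mask]

    # Solve min over (s, b) of ||s * p + b - t||^2 in closed form.
    A = np.stack([p, np.ones_like(p)], axis=1)
    (s, b), *_ = np.linalg.lstsq(A, t, rcond=None)

    # Mean absolute error after alignment.
    return float(np.mean(np.abs(s * p + b - t)))
```

Because the alignment is refit for every input, the loss is unchanged when `pred` is rescaled or offset. This is why such a loss can compare a metric depth map against a monocular estimate whose scale is unknown.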

Code Implementations: 1 repo