CV | Mar 8, 2024

Stealing Stable Diffusion Prior for Robust Monocular Depth Estimation

arXiv:2403.05056v1 | 6 citations | h-index: 3 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses a limitation of existing depth estimation methods, which struggle in challenging scenarios because their training data lacks diversity. It offers an incremental improvement for applications in autonomous driving and robotics.

The paper tackles robust monocular depth estimation in challenging conditions such as low light or rain. It uses Stable Diffusion to generate synthetic training data and integrates the DINOv2 encoder for semantic priors. The method is evaluated on the nuScenes and Oxford RobotCar datasets.

Monocular depth estimation is a crucial task in computer vision. While existing methods have shown impressive results under standard conditions, they often fail to perform reliably in scenarios such as low-light or rainy conditions because diverse training data is absent. This paper introduces a novel approach named Stealing Stable Diffusion (SSD) prior for robust monocular depth estimation. The approach addresses this limitation by using Stable Diffusion to generate synthetic images that mimic challenging conditions. Additionally, a self-training mechanism is introduced to enhance the model's depth estimation capability in such challenging environments. To make further use of the Stable Diffusion prior, the DINOv2 encoder is integrated into the depth model architecture, enabling the model to leverage rich semantic priors and improve its scene understanding. Furthermore, a teacher loss is introduced to guide the student models in acquiring meaningful knowledge independently, thus reducing their dependency on the teacher models. The approach is evaluated on nuScenes and Oxford RobotCar, two challenging public datasets, and the results show the efficacy of the method. Source code and weights are available at: https://github.com/hitcslj/SSD.
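
The self-training idea in the abstract can be illustrated with a toy teacher-student loss. This is only a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not give the loss formula, so the L1 distance, the pairing of a clear image with its synthetic challenging counterpart, and all names here (`l1_distill_loss`, `self_training_loss`, `toy_teacher`, `toy_student`) are hypothetical.

```python
from typing import Callable, List, Sequence

# Toy stand-ins: a flattened "image" and a flattened per-pixel depth map.
Image = List[float]
Depth = List[float]
DepthModel = Callable[[Image], Depth]


def l1_distill_loss(student_depth: Sequence[float],
                    teacher_depth: Sequence[float]) -> float:
    """Mean absolute difference between the student's prediction and the
    teacher's pseudo-label. L1 is an assumption, not taken from the paper."""
    if len(student_depth) != len(teacher_depth):
        raise ValueError("depth maps must have the same number of pixels")
    total = sum(abs(s - t) for s, t in zip(student_depth, teacher_depth))
    return total / len(student_depth)


def self_training_loss(teacher: DepthModel, student: DepthModel,
                       clear_image: Image, hard_image: Image) -> float:
    """Assumed pairing: the teacher labels the clear image, and the student
    must reproduce that depth from the synthetic challenging version."""
    pseudo_label = teacher(clear_image)   # depth from the easy view
    prediction = student(hard_image)      # depth from the hard view
    return l1_distill_loss(prediction, pseudo_label)


# Hypothetical toy models so the sketch runs end to end.
def toy_teacher(img: Image) -> Depth:
    return [2.0 * p for p in img]


def toy_student(img: Image) -> Depth:
    return [2.0 * p + 0.5 for p in img]


clear = [1.0, 2.0, 3.0, 4.0]
hard = [0.5, 1.0, 1.5, 2.0]   # stand-in for a darkened copy of `clear`
loss = self_training_loss(toy_teacher, toy_student, clear, hard)
print(loss)
```

In a real pipeline the two callables would be depth networks and `hard_image` would come from an image generator; the repository linked in the abstract is the place to check what the authors actually do.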

Code Implementations: 1 repo
