CV, Jun 14, 2024

Unsupervised Monocular Depth Estimation Based on Hierarchical Feature-Guided Diffusion

arXiv:2406.09782v3, 1 citation
Originality: Incremental advance
AI Analysis

This work addresses robust depth estimation from single images in real-world scenarios like adverse weather, which is important for applications in autonomous driving and robotics, though it is incremental as it builds on existing generative methods.

The paper tackles unsupervised monocular depth estimation by proposing a hierarchical feature-guided diffusion model with an implicit depth consistency loss, achieving state-of-the-art results among generative-based models on datasets like KITTI and Make3D, with demonstrated robustness.

Unsupervised monocular depth estimation has received widespread attention because it can be trained without ground-truth depth. In real-world scenarios, images may be blurry or noisy because of weather conditions and inherent camera limitations, so developing a robust depth estimation model is particularly important. Generative-based methods often exhibit enhanced robustness thanks to the training strategies of generative networks. In light of this, we employ a diffusion model, which converges well among generative networks, for unsupervised monocular depth estimation. Additionally, we propose a hierarchical feature-guided denoising module. This module significantly enriches the model's capacity to learn and interpret the depth distribution by fully leveraging image features to guide the denoising process. Furthermore, we explore the implicit depth within reprojection and design an implicit depth consistency loss, which improves model performance and ensures scale-consistent depth within a video sequence. We conduct experiments on KITTI, Make3D, and our self-collected SIMIT dataset. The results indicate that our approach stands out among generative-based models while also showing remarkable robustness.
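
For readers unfamiliar with the denoising process the abstract refers to, the following is a minimal sketch of the standard DDPM-style forward noising step applied to a depth map. It is not the paper's implementation: the linear schedule, the step count, the toy depth map, and the function names (`linear_beta_schedule`, `forward_noise`) are all illustrative assumptions. A denoising network, which in the paper is guided by hierarchical image features, would be trained to undo this corruption.

```python
import numpy as np

def linear_beta_schedule(num_steps, beta_start=1e-4, beta_end=0.02):
    # Assumed linear noise schedule; the paper's schedule is not given here.
    return np.linspace(beta_start, beta_end, num_steps)

def forward_noise(depth0, t, alpha_bar, rng):
    # Sample x_t ~ q(x_t | x_0) = N(sqrt(alpha_bar_t) * x_0, (1 - alpha_bar_t) * I).
    eps = rng.standard_normal(depth0.shape)
    noisy = np.sqrt(alpha_bar[t]) * depth0 + np.sqrt(1.0 - alpha_bar[t]) * eps
    return noisy, eps

betas = linear_beta_schedule(1000)
alpha_bar = np.cumprod(1.0 - betas)           # cumulative signal retention per step

rng = np.random.default_rng(0)
depth0 = rng.uniform(-1.0, 1.0, size=(4, 6))  # hypothetical normalized depth map
noisy, eps = forward_noise(depth0, 500, alpha_bar, rng)
```

Given the true noise `eps`, the clean depth map can be recovered exactly from `noisy`, which is why diffusion models are commonly trained to predict the noise (or the clean signal) from the corrupted input and the conditioning image.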
