CV · Nov 7, 2022

SC-DepthV3: Robust Self-supervised Monocular Depth Estimation for Dynamic Scenes

ByteDance, Oxford
arXiv:2211.03660v2 · 107 citations · h-index: 93 · Has Code
Originality: Incremental advance
AI Analysis

This work addresses robust depth estimation in dynamic scenes for applications such as autonomous driving and robotics. It is an incremental improvement over existing methods.

The paper tackles two weaknesses of self-supervised monocular depth estimation in dynamic scenes: poor depth accuracy and blurred object boundaries. It introduces an external pretrained model to generate pseudo-depth and proposes novel losses built on it, and it reports significantly superior performance over previous methods on six challenging datasets.

Self-supervised monocular depth estimation has shown impressive results in static scenes. It relies on the multi-view consistency assumption for training networks; however, that assumption is violated in dynamic object regions and occlusions. Consequently, existing methods show poor accuracy in dynamic scenes, and the estimated depth map is blurred at object boundaries because they are usually occluded in other training views. In this paper, we propose SC-DepthV3 to address these challenges. Specifically, we introduce an external pretrained monocular depth estimation model to generate a single-image depth prior, namely pseudo-depth, based on which we propose novel losses to boost self-supervised training. As a result, our model can predict sharp and accurate depth maps, even when training from monocular videos of highly dynamic scenes. We demonstrate the significantly superior performance of our method over previous methods on six challenging datasets, and we provide detailed ablation studies for the proposed terms. Source code and data will be released at https://github.com/JiawangBian/sc_depth_pl
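The abstract does not spell out the proposed losses, so the following is only an illustrative sketch of how a pseudo-depth prior could guide training, not the paper's formulation. It assumes the pseudo-depth is trusted only for relative ordering between pixels and applies a generic pairwise ranking loss. The function name `ranking_loss`, the `tol` threshold, and the pair-sampling scheme are all hypothetical.

```python
import math

def ranking_loss(pred, pseudo, pairs, tol=0.05):
    """Mean pairwise ranking loss of `pred` against the ordering implied by `pseudo`.

    pred, pseudo: flat sequences of per-pixel depth values (pseudo must be > 0).
    pairs: iterable of (i, j) pixel-index pairs to compare.
    tol: hypothetical ratio threshold below which a pair is treated as ambiguous.
    """
    total, count = 0.0, 0
    for i, j in pairs:
        ratio = pseudo[i] / pseudo[j]
        if ratio > 1.0 + tol:
            label = 1.0    # pseudo-depth says pixel i is farther than pixel j
        elif ratio < 1.0 / (1.0 + tol):
            label = -1.0   # pseudo-depth says pixel i is nearer than pixel j
        else:
            continue       # ordering too close to call; skip this pair
        # Logistic penalty: small when pred agrees with the ordering, large otherwise.
        total += math.log1p(math.exp(-label * (pred[i] - pred[j])))
        count += 1
    return total / count if count else 0.0

# Usage: a prediction that preserves the pseudo-depth ordering scores lower.
pseudo = [1.0, 2.0, 4.0]
pairs = [(0, 1), (1, 2), (0, 2)]
loss_good = ranking_loss([0.5, 1.0, 2.0], pseudo, pairs)
loss_bad = ranking_loss([2.0, 1.0, 0.5], pseudo, pairs)
```

Because the loss depends only on depth ordering, it is unaffected by the unknown scale of the pseudo-depth, which is one reason an ordering-based term is a plausible choice when the prior comes from a separate pretrained network.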

Code Implementations: 2 repos
