CV, May 10, 2023

FusionDepth: Complement Self-Supervised Monocular Depth Estimation with Cost Volume

arXiv:2305.06036v1
Originality: Incremental advance
AI Analysis

This work addresses depth estimation for autonomous driving and robotics by combining monocular and multi-view techniques to handle moving objects and low-textured surfaces. It is an incremental advance.

The paper tackles the problem of improving self-supervised monocular depth estimation by integrating it with multi-view stereo cost volumes, resulting in a method that surpasses state-of-the-art unsupervised approaches on the KITTI benchmark.

Multi-view stereo depth estimation based on cost volumes usually works better than self-supervised monocular depth estimation, except on moving objects and low-textured surfaces. In this paper, we therefore propose a multi-frame depth estimation framework in which monocular depth is refined continuously by multi-frame sequential constraints, using a Bayesian fusion layer over several iterations. Both the monocular and multi-view networks can be trained without depth supervision. Our method also improves interpretability when combining monocular estimation with a multi-view cost volume. Detailed experiments show that our method surpasses state-of-the-art unsupervised methods that use single or multiple frames at test time on the KITTI benchmark.
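
The abstract does not spell out the form of the Bayesian fusion layer, so the following is only an illustrative sketch of the general idea, not the authors' implementation. It assumes each depth estimate at a pixel can be modelled as a Gaussian with a mean and a variance, and it fuses a monocular prior with successive multi-view observations by multiplying Gaussians (a precision-weighted average). The names `DepthBelief`, `fuse`, and `refine` are hypothetical.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class DepthBelief:
    """Gaussian belief over one pixel's depth (hypothetical helper, not from the paper)."""
    mean: float      # depth estimate, e.g. in metres
    variance: float  # uncertainty of that estimate (must be > 0)


def fuse(prior: DepthBelief, observation: DepthBelief) -> DepthBelief:
    """Fuse two Gaussian beliefs by multiplying them.

    The result is a precision-weighted average of the two means,
    and its variance is never larger than either input variance.
    """
    precision = 1.0 / prior.variance + 1.0 / observation.variance
    mean = (prior.mean / prior.variance
            + observation.mean / observation.variance) / precision
    return DepthBelief(mean=mean, variance=1.0 / precision)


def refine(mono: DepthBelief, multi_view: Iterable[DepthBelief]) -> DepthBelief:
    """Start from the monocular belief and fold in one multi-view observation per iteration."""
    belief = mono
    for observation in multi_view:
        belief = fuse(belief, observation)
    return belief
```

Under this assumption, a multi-view observation with a large variance (as might be assigned to a moving object or a low-textured surface) barely moves the fused estimate away from the monocular prior, while a confident observation dominates it. For example, `refine(DepthBelief(10.0, 4.0), [DepthBelief(12.0, 1.0)])` yields a mean that lies between the two inputs and closer to the more confident one.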

