CV · Jul 13, 2022

Joint Prediction of Monocular Depth and Structure using Planar and Parallax Geometry

arXiv:2207.06351v1 · 10 citations · h-index: 30
Originality: Incremental advance
AI Analysis

This work addresses the challenge of obtaining high-quality depth data for computer vision applications, offering an incremental improvement over existing methods.

The paper tackles the problem of monocular depth estimation by combining structure information from planar and parallax geometry with depth data in a U-Net network, achieving the best performance in terms of relative error on the KITTI and Cityscapes datasets.

Supervised depth estimation methods can achieve good performance when trained on high-quality ground truth, such as LiDAR data. However, LiDAR can only generate sparse 3D maps, which loses information, and high-quality per-pixel ground-truth depth is difficult to acquire. To overcome this limitation, we propose a novel approach that combines structure information from a promising Plane and Parallax geometry pipeline with depth information in a U-Net supervised learning network, which results in quantitative and qualitative improvements over existing popular learning-based methods. In particular, the model is evaluated on two large-scale and challenging datasets, the KITTI Vision Benchmark and the Cityscapes dataset, and achieves the best performance in terms of relative error. Compared with pure depth supervision models, our model performs impressively on depth prediction for thin objects and edges, and compared to the structure prediction baseline, it performs more robustly.
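The abstract reports results "in terms of relative error" without defining the metric. A minimal sketch follows, assuming it refers to the absolute relative error commonly reported on depth benchmarks, averaged only over pixels that have a ground-truth value (LiDAR ground truth is sparse). The function name `abs_rel_error` and the convention that a ground-truth depth of 0 marks a missing measurement are illustrative assumptions, not details taken from the paper.

```python
def abs_rel_error(pred, gt):
    """Mean of |pred - gt| / gt over pixels with valid ground truth.

    pred, gt: equal-length sequences of depths (e.g. in metres).
    Assumption: gt <= 0 marks a pixel with no ground-truth measurement.
    """
    if len(pred) != len(gt):
        raise ValueError("pred and gt must have the same length")
    # Keep only pixels where sparse ground truth is available.
    valid = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not valid:
        raise ValueError("no valid ground-truth pixels")
    return sum(abs(p - g) / g for p, g in valid) / len(valid)


# Hypothetical toy values: the third pixel has no ground truth and is skipped.
pred = [10.0, 22.0, 5.0, 40.0]
gt = [8.0, 20.0, 0.0, 50.0]
score = abs_rel_error(pred, gt)
```

Masking before averaging matters with sparse ground truth: dividing by the total pixel count instead of the valid count would understate the error.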
