CV | Jul 4, 2016

A Two-Streamed Network for Estimating Fine-Scaled Depth Maps from Single RGB Images

arXiv:1607.00730v4 | 42 citations
AI Analysis

This work targets more accurate and detailed 3D reconstructions from 2D images, which matter for applications such as robotics and augmented reality. The contribution is incremental, since it builds on existing deep learning methods.

The paper tackles the estimation of detailed depth maps from single RGB images. Depth maps predicted by earlier methods often lack local detail when projected into 3D, so the authors propose a two-streamed CNN that predicts both depth and depth gradients and achieves competitive accuracy on the NYU Depth v2 dataset.

Estimating depth from a single RGB image is an ill-posed and inherently ambiguous problem. State-of-the-art deep learning methods can now estimate accurate 2D depth maps, but when the maps are projected into 3D, they lack local detail and are often highly distorted. We propose a fast-to-train two-streamed CNN that predicts depth and depth gradients, which are then fused together into an accurate and detailed depth map. We also define a novel set loss over multiple images; by regularizing the estimation between a common set of images, the network is less prone to over-fitting and achieves better accuracy than competing methods. Experiments on the NYU Depth v2 dataset show that our depth predictions are competitive with the state of the art and lead to faithful 3D projections.
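To make the two ideas in the abstract concrete, here is a minimal sketch, not the paper's implementation. It assumes depth maps stored as plain Python lists of rows, forward-difference gradients, and squared-error penalties. The names `depth_gradients`, `fusion_objective`, `set_consistency`, and the `weight` parameter are all hypothetical. The first objective scores a candidate fused map by how closely it follows the depth stream and how well its gradients match the gradient stream. The second penalizes disagreement among predictions that are assumed to be already aligned to a common frame.

```python
def depth_gradients(depth):
    """Forward differences of a depth map (list of rows); the far edge is set to 0."""
    h, w = len(depth), len(depth[0])
    gx = [[depth[y][x + 1] - depth[y][x] if x + 1 < w else 0.0 for x in range(w)]
          for y in range(h)]
    gy = [[depth[y + 1][x] - depth[y][x] if y + 1 < h else 0.0 for x in range(w)]
          for y in range(h)]
    return gx, gy


def mean_sq_diff(a, b):
    """Mean squared difference between two equally sized maps."""
    diffs = [(p - q) ** 2 for row_a, row_b in zip(a, b) for p, q in zip(row_a, row_b)]
    return sum(diffs) / len(diffs)


def fusion_objective(fused, pred_depth, pred_gx, pred_gy, weight=1.0):
    """Assumed fusion score: depth agreement plus weighted gradient agreement."""
    gx, gy = depth_gradients(fused)
    depth_term = mean_sq_diff(fused, pred_depth)
    grad_term = mean_sq_diff(gx, pred_gx) + mean_sq_diff(gy, pred_gy)
    return depth_term + weight * grad_term


def set_consistency(preds):
    """Assumed set regularizer: average squared deviation from the per-pixel mean."""
    h, w = len(preds[0]), len(preds[0][0])
    mean = [[sum(p[y][x] for p in preds) / len(preds) for x in range(w)]
            for y in range(h)]
    return sum(mean_sq_diff(p, mean) for p in preds) / len(preds)


# Usage: a horizontal ramp agrees perfectly with its own gradients.
ramp = [[0.0, 1.0, 2.0, 3.0] for _ in range(3)]
ramp_gx, ramp_gy = depth_gradients(ramp)
score = fusion_objective(ramp, ramp, ramp_gx, ramp_gy)
```

In this sketch a constant offset added to the fused map changes only the depth term, because the gradients are unaffected. This is one way to see why a separate gradient stream can carry local detail that a depth-only loss tends to under-weight.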

