CV · Jun 25, 2018

Learning Single-Image Depth from Videos using Quality Assessment Networks

arXiv:1806.09573v3 · 17.171 citations

Originality: Incremental advance

AI Analysis

The work addresses the lack of training data for depth estimation in uncontrolled environments, which benefits computer vision applications. The advance is incremental because it builds on existing Structure-from-Motion (SfM) techniques.

The paper tackles single-image depth estimation in the wild. It proposes a method that automatically generates high-quality training data by running Structure-from-Motion on Internet videos, with a Quality Assessment Network that identifies reliable reconstructions. The resulting YouTube3D dataset advances state-of-the-art performance.

Depth estimation from a single image in the wild remains a challenging problem. One main obstacle is the lack of high-quality training data for images in the wild. In this paper we propose a method to automatically generate such data through Structure-from-Motion (SfM) on Internet videos. The core of this method is a Quality Assessment Network that identifies high-quality reconstructions obtained from SfM. Using this method, we collect single-view depth training data from a large number of YouTube videos and construct a new dataset called YouTube3D. Experiments show that YouTube3D is useful in training depth estimation networks and advances the state of the art of single-view depth estimation in the wild.
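The filtering step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the `Reconstruction` container, the `select_reliable` helper, the `toy_score` function, and the 0.5 threshold are all hypothetical stand-ins. The sketch only shows the general idea of scoring each SfM reconstruction and keeping those that score high enough; in the paper, a trained Quality Assessment Network plays the role of the scoring function.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Reconstruction:
    """One SfM result for a video clip (hypothetical container)."""
    video_id: str
    features: List[float]  # hypothetical summary features of the reconstruction


def select_reliable(
    reconstructions: List[Reconstruction],
    score_fn: Callable[[Reconstruction], float],
    threshold: float = 0.5,  # assumed cutoff, not taken from the paper
) -> List[Reconstruction]:
    """Keep only reconstructions whose quality score reaches the threshold."""
    return [r for r in reconstructions if score_fn(r) >= threshold]


def toy_score(r: Reconstruction) -> float:
    """Stand-in for a trained quality network: feature mean clipped to [0, 1]."""
    if not r.features:
        return 0.0
    mean = sum(r.features) / len(r.features)
    return max(0.0, min(1.0, mean))


candidates = [
    Reconstruction("clip_a", [0.9, 0.8]),
    Reconstruction("clip_b", [0.1, 0.2]),
    Reconstruction("clip_c", []),
]
kept = select_reliable(candidates, toy_score, threshold=0.5)
print([r.video_id for r in kept])  # prints ['clip_a']
```

Passing the scorer as a function keeps the selection logic independent of how quality is estimated, so a learned model could replace `toy_score` without changing `select_reliable`.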
