CV · Aug 27, 2022

Neural Camera Models

arXiv:2208.12903v1 · 1.4 · h-index: 8

Originality: Synthesis-oriented

AI Analysis

This work addresses problems in computer vision for robotics and autonomous vehicles, but appears incremental as it builds on existing depth estimation methods.

The thesis tackles the challenges of depth estimation by addressing the lack of scalable ground truth labels, unreliable camera information, and restrictive camera assumptions, aiming to turn cameras into generic depth sensors.

Modern computer vision has moved beyond the domain of internet photo collections and into the physical world, guiding camera-equipped robots and autonomous cars through unstructured environments. To enable these embodied agents to interact with real-world objects, cameras are increasingly being used as depth sensors, reconstructing the environment for a variety of downstream reasoning tasks. Machine-learning-aided depth perception, or depth estimation, predicts for each pixel in an image the distance to the imaged scene point. While impressive strides have been made in depth estimation, significant challenges remain: (1) ground truth depth labels are difficult and expensive to collect at scale, (2) camera information is typically assumed to be known but is often unreliable, and (3) restrictive camera assumptions are common, even though a great variety of camera types and lenses are used in practice. In this thesis, we focus on relaxing these assumptions and describe contributions toward the ultimate goal of turning cameras into truly generic depth sensors.
