CV · Nov 21, 2017

Aperture Supervision for Monocular Depth Estimation

arXiv:1711.07933v2 · 63 citations
Originality: Incremental advance
AI Analysis

The paper addresses monocular depth estimation for computer vision by proposing a new supervision signal, the depth-of-field effects of a camera's aperture. The advance appears incremental because it builds on existing depth estimation frameworks.

The paper tackles monocular depth estimation by using images taken with varying camera apertures as supervision, training a network to predict scene depths that explain defocus-blurred renderings of an all-in-focus image.

We present a novel method to train machine learning algorithms to estimate scene depths from a single image, by using the information provided by a camera's aperture as supervision. Prior works use a depth sensor's outputs or images of the same scene from alternate viewpoints as supervision, while our method instead uses images from the same viewpoint taken with a varying camera aperture. To enable learning algorithms to use aperture effects as supervision, we introduce two differentiable aperture rendering functions that use the input image and predicted depths to simulate the depth-of-field effects caused by real camera apertures. We train a monocular depth estimation network end-to-end to predict the scene depths that best explain these finite aperture images as defocus-blurred renderings of the input all-in-focus image.
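To make the idea of an aperture rendering function concrete, here is a minimal sketch of one way to render a shallow depth-of-field image from an all-in-focus image and a per-pixel disparity map. It is an illustrative assumption, not either of the paper's two rendering functions: the name `render_shallow_dof`, the parameters `focus`, `aperture`, and `planes`, the disk-shaped kernel, and the soft layer assignment are all hypothetical choices. The sketch is written in plain Python for readability and is not occlusion-aware. Every step is a sum or product of soft weights, so the same computation expressed in an autodiff framework would be differentiable with respect to the disparities.

```python
import math

def disk_kernel(radius):
    """Normalized disk-shaped blur kernel as (dy, dx, weight) triples.
    A radius of 0 yields the identity kernel."""
    r = int(math.ceil(radius))
    offsets = [(dy, dx)
               for dy in range(-r, r + 1)
               for dx in range(-r, r + 1)
               if dy * dy + dx * dx <= radius * radius + 1e-9]
    weight = 1.0 / len(offsets)
    return [(dy, dx, weight) for dy, dx in offsets]

def blur(image, kernel):
    """Apply a sparse kernel to a 2D list, clamping coordinates at the borders."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            out[y][x] = sum(
                wt * image[min(max(y + dy, 0), h - 1)][min(max(x + dx, 0), w - 1)]
                for dy, dx, wt in kernel)
    return out

def render_shallow_dof(image, disparity, focus, aperture, planes):
    """Layered defocus rendering (illustrative assumption, not the paper's method).

    image:     2D list, all-in-focus grayscale image
    disparity: 2D list, per-pixel disparity within [planes[0], planes[-1]]
    focus:     disparity that stays sharp
    aperture:  scale factor; blur radius = aperture * |plane - focus|
    planes:    evenly spaced, increasing disparity planes (at least two)
    """
    h, w = len(image), len(image[0])
    spacing = planes[1] - planes[0]
    num = [[0.0] * w for _ in range(h)]
    den = [[0.0] * w for _ in range(h)]
    for d in planes:
        # Soft (tent-shaped) assignment of each pixel to this plane.
        mask = [[max(0.0, 1.0 - abs(disparity[y][x] - d) / spacing)
                 for x in range(w)] for y in range(h)]
        layer = [[mask[y][x] * image[y][x] for x in range(w)] for y in range(h)]
        kernel = disk_kernel(aperture * abs(d - focus))
        blurred_layer = blur(layer, kernel)
        blurred_mask = blur(mask, kernel)
        for y in range(h):
            for x in range(w):
                num[y][x] += blurred_layer[y][x]
                den[y][x] += blurred_mask[y][x]
    # Normalize by the accumulated blurred masks.
    return [[num[y][x] / (den[y][x] + 1e-12) for x in range(w)] for y in range(h)]
```

In a training loop of the kind the abstract describes, the output of such a function would be compared against a real finite-aperture photograph, with the loss gradient flowing back through the renderer into the depth network. With `aperture=0` the sketch returns the input image unchanged, which is a quick sanity check.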

