CV · Nov 29, 2018

InverseRenderNet: Learning single image inverse rendering

arXiv:1811.12328v1 · 179 citations · Has Code
Originality: Highly original
AI Analysis

This work addresses the challenge of estimating photometric invariants like albedo and normals from single images for computer vision applications, introducing a novel use of multiview stereo supervision.

The paper tackles inverse rendering from a single uncontrolled image by training a fully convolutional neural network to regress albedo and normal maps. Training is self-supervised through a differentiable renderer, with additional supervision from multiview stereo to enforce consistency across overlapping views, so no ground-truth data is required.
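To make the self-supervision idea concrete, here is a minimal NumPy sketch, not the authors' implementation. It assumes a Lambertian model with a simplified 4-term lighting basis that is linear in the normal (`[1, nx, ny, nz]`); the page does not state the paper's actual lighting model. The function names `shading_basis`, `solve_lighting`, `render`, and `reconstruction_loss` are hypothetical. Given predicted albedo and normals, lighting is recovered by least squares, the image is re-rendered, and the difference from the input image serves as the training signal.

```python
import numpy as np

def shading_basis(normals):
    """Simplified 4-term lighting basis [1, nx, ny, nz] (an assumption).

    normals: (N, 3) array of unit surface normals.
    """
    ones = np.ones((normals.shape[0], 1))
    return np.hstack([ones, normals])            # (N, 4)

def solve_lighting(image, albedo, normals):
    """Least-squares lighting coefficients, one column per colour channel.

    image, albedo: (N, 3) arrays of per-pixel RGB values.
    """
    basis = shading_basis(normals)               # (N, 4)
    coeffs = np.empty((4, 3))
    for c in range(3):
        # image_c = albedo_c * (basis @ l_c) is linear in l_c
        design = albedo[:, c:c + 1] * basis      # (N, 4)
        coeffs[:, c] = np.linalg.lstsq(design, image[:, c], rcond=None)[0]
    return coeffs

def render(albedo, normals, coeffs):
    """Lambertian rendering: albedo times shading, per channel."""
    return albedo * (shading_basis(normals) @ coeffs)

def reconstruction_loss(image, albedo, normals):
    """Mean squared error between the input and its re-rendering."""
    coeffs = solve_lighting(image, albedo, normals)
    return float(np.mean((render(albedo, normals, coeffs) - image) ** 2))
```

In a training setup, `albedo` and `normals` would come from the network, and every step here is differentiable, so the loss can be backpropagated. This sketch only illustrates the forward computation.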

We show how to train a fully convolutional neural network to perform inverse rendering from a single, uncontrolled image. The network takes an RGB image as input and regresses albedo and normal maps, from which we compute lighting coefficients. Our network is trained using large uncontrolled image collections without ground truth. By incorporating a differentiable renderer, our network can learn from self-supervision. Since the problem is ill-posed, we introduce additional supervision: (1) we learn a statistical natural illumination prior; (2) our key insight is to perform offline multiview stereo (MVS) on images containing rich illumination variation. From the MVS pose and depth maps, we can cross-project between overlapping views such that Siamese training can be used to ensure consistent estimation of photometric invariants. MVS depth also provides direct coarse supervision for normal map estimation. We believe this is the first attempt to use MVS supervision for learning inverse rendering.
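The abstract notes that MVS depth gives coarse supervision for normal maps. One common way to derive normals from a depth map, shown here as a hedged sketch rather than the paper's procedure, is to back-project pixels to 3D with a pinhole camera and take the cross product of neighbouring point differences. The pinhole intrinsics `fx`, `fy`, `cx`, `cy` and the function names are assumptions introduced for illustration.

```python
import numpy as np

def backproject(depth, fx, fy, cx, cy):
    """Lift a depth map (h, w) to camera-space 3D points (h, w, 3)."""
    h, w = depth.shape
    u, v = np.meshgrid(np.arange(w), np.arange(h))
    x = (u - cx) / fx * depth
    y = (v - cy) / fy * depth
    return np.stack([x, y, depth], axis=-1)

def normals_from_depth(depth, fx, fy, cx, cy):
    """Coarse unit normals for interior pixels, shape (h-2, w-2, 3).

    Uses central differences of the back-projected points. Normals are
    oriented towards the camera (negative z for a fronto-parallel plane).
    """
    pts = backproject(depth, fx, fy, cx, cy)
    du = pts[1:-1, 2:] - pts[1:-1, :-2]    # difference along image columns
    dv = pts[2:, 1:-1] - pts[:-2, 1:-1]    # difference along image rows
    n = np.cross(dv, du)
    return n / np.linalg.norm(n, axis=-1, keepdims=True)
```

For example, `normals_from_depth(np.full((6, 8), 2.0), 10.0, 10.0, 3.5, 2.5)` returns normals for a flat wall facing the camera. Real MVS depth is noisy and has holes, so a practical version would mask invalid pixels before differencing.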

Code Implementations: 1 repo
