CV
Jan 6, 2022

De-rendering 3D Objects in the Wild

arXiv:2201.02279v2
39 citations
AI Analysis

The paper addresses the need for scalable 3D object understanding in XR without extensive supervised data, though its contribution is incremental because it builds on existing shape estimation methods.

The paper tackles the problem of decomposing a single image of an object into a 3D representation (shape, material, and lighting) for augmented and virtual reality applications. Its weakly supervised method requires only rough initial shape estimates for training and generalizes to unseen object categories. Because real-world ground truth is lacking, quantitative evaluation is done on a photo-realistic synthetic test set that the authors introduce.

With increasing focus on augmented and virtual reality applications (XR) comes the demand for algorithms that can lift objects from images and videos into representations that are suitable for a wide variety of related 3D tasks. Large-scale deployment of XR devices and applications means that we cannot solely rely on supervised learning, as collecting and annotating data for the unlimited variety of objects in the real world is infeasible. We present a weakly supervised method that is able to decompose a single image of an object into shape (depth and normals), material (albedo, reflectivity and shininess) and global lighting parameters. For training, the method only relies on a rough initial shape estimate of the training objects to bootstrap the learning process. This shape supervision can come for example from a pretrained depth network or - more generically - from a traditional structure-from-motion pipeline. In our experiments, we show that the method can successfully de-render 2D images into a decomposed 3D representation and generalizes to unseen object categories. Since in-the-wild evaluation is difficult due to the lack of ground truth data, we also introduce a photo-realistic synthetic test set that allows for quantitative evaluation.
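
The abstract lists the decomposed components (normals, albedo, reflectivity, shininess, global lighting) but does not spell out how they recombine into an image. The following is a minimal sketch of one plausible recombination, assuming a Phong-style shading model with a single directional light and a camera looking along the z-axis. The function name `phong_render` and all parameter names are hypothetical and are not taken from the paper or its code.

```python
import numpy as np


def phong_render(normals, albedo, light_dir, ambient, diffuse,
                 reflectivity, shininess, view_dir=(0.0, 0.0, 1.0)):
    """Re-render an image from decomposed components (illustrative only).

    normals:      (H, W, 3) surface normals
    albedo:       (H, W, 3) diffuse color in [0, 1]
    light_dir:    (3,) direction pointing from the surface toward the light
    ambient:      scalar ambient light intensity
    diffuse:      scalar directional light intensity
    reflectivity: scalar specular strength (assumed global here)
    shininess:    scalar Phong exponent (assumed global here)
    """
    # Normalize all direction vectors so dot products are cosines.
    n = normals / np.linalg.norm(normals, axis=-1, keepdims=True)
    l = np.asarray(light_dir, dtype=float)
    l = l / np.linalg.norm(l)
    v = np.asarray(view_dir, dtype=float)
    v = v / np.linalg.norm(v)

    # Diffuse term: cosine between normal and light, clamped at zero.
    n_dot_l_raw = n @ l                      # (H, W)
    n_dot_l = np.clip(n_dot_l_raw, 0.0, None)

    # Specular term: reflect the light about the normal, r = 2 (n.l) n - l,
    # and compare the reflection with the view direction.
    r = 2.0 * n_dot_l_raw[..., None] * n - l
    r_dot_v = np.clip(r @ v, 0.0, None)
    # No highlight on surfaces facing away from the light.
    specular = reflectivity * np.where(n_dot_l > 0, r_dot_v ** shininess, 0.0)

    # Combine: albedo scaled by shading, plus a white specular highlight.
    shading = ambient + diffuse * n_dot_l
    image = albedo * shading[..., None] + specular[..., None]
    return np.clip(image, 0.0, 1.0)
```

In a de-rendering setup, a function of this kind would typically be applied to the predicted components so that the re-rendered image can be compared with the input photograph; whether the paper uses exactly this model is an assumption of the sketch.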

Code Implementations: 1 repo