Deep disentangled representations for volumetric reconstruction
This work addresses 3D reconstruction from 2D images, with applications in computer vision and graphics. It appears incremental, since it builds on existing disentanglement and reconstruction methods.
The authors infer compact, disentangled graphical descriptions from 2D images for volumetric reconstruction. They do so with a convolutional neural network consisting of an encoder and a twin-tailed decoder, which learns separate representations of a 3D object and of its lighting and pose conditions.
We introduce a convolutional neural network for inferring a compact disentangled graphical description of objects from 2D images that can be used for volumetric reconstruction. The network comprises an encoder and a twin-tailed decoder. The encoder generates a disentangled graphics code. The first decoder generates a volume, and the second decoder reconstructs the input image using a novel training regime that allows the graphics code to learn a separate representation of the 3D object and a description of its lighting and pose conditions. We demonstrate this method by generating volumes and disentangled graphical descriptions from images and videos of faces and chairs.
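The data flow through the encoder and the two decoder tails can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: random dense layers stand in for the convolutional layers, no training is performed, and the names `TwinTailedNet`, `shape_code`, `scene_code`, the layer sizes, and the choice to feed only the object part of the graphics code to the volume decoder are all hypothetical.

```python
import random


def linear(n_in, n_out, rng):
    """Return a random dense layer as a function.

    Stand-in (assumption) for the learned convolutional layers.
    """
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
               for _ in range(n_out)]

    def apply(x):
        return [sum(w * v for w, v in zip(row, x)) for row in weights]

    return apply


class TwinTailedNet:
    """Encoder plus two decoder tails (hypothetical sizes and names)."""

    def __init__(self, image_size=8, shape_dim=4, scene_dim=2,
                 volume_size=4, seed=0):
        rng = random.Random(seed)
        n_pixels = image_size * image_size
        code_dim = shape_dim + scene_dim
        self.shape_dim = shape_dim
        # Encoder: flattened image -> graphics code.
        self.encoder = linear(n_pixels, code_dim, rng)
        # First tail: object part of the code -> flattened voxel volume.
        self.volume_decoder = linear(shape_dim, volume_size ** 3, rng)
        # Second tail: full code -> reconstructed flattened image.
        self.image_decoder = linear(code_dim, n_pixels, rng)

    def forward(self, image):
        code = self.encoder(image)
        # Assumed layout: the first entries describe the 3D object,
        # the remaining entries describe lighting and pose.
        shape_code = code[:self.shape_dim]
        scene_code = code[self.shape_dim:]
        volume = self.volume_decoder(shape_code)
        reconstruction = self.image_decoder(code)
        return shape_code, scene_code, volume, reconstruction


net = TwinTailedNet()
image = [0.5] * 64  # flattened 8x8 grayscale image (toy input)
shape_code, scene_code, volume, reconstruction = net.forward(image)
print(len(shape_code), len(scene_code), len(volume), len(reconstruction))
```

In this sketch the volume depends only on `shape_code`, so changing the lighting and pose entries alters the reconstructed image but leaves the volume untouched. That is one simple way to picture what a disentangled graphics code provides.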