CV | Dec 10, 2019

3D-GMNet: Single-View 3D Shape Recovery as A Gaussian Mixture

Kohei Yamashita, Shohei Nobuhara, Ko Nishino

arXiv:1912.04663v2 | 5.44 citations

Originality: Incremental advance

AI Analysis

This paper addresses efficient, versatile 3D reconstruction from single images for computer vision applications. The contribution appears incremental: it builds on existing deep learning methods and adds a new shape representation.

The paper tackles 3D object shape reconstruction from a single image by introducing 3D-GMNet, a network that represents shapes as Gaussian mixtures. This representation yields accurate reconstruction with a small memory footprint and enables applications such as instant pose estimation.

In this paper, we introduce 3D-GMNet, a deep neural network for 3D object shape reconstruction from a single image. As the name suggests, 3D-GMNet recovers 3D shape as a Gaussian mixture. In contrast to voxels, point clouds, or meshes, a Gaussian mixture representation provides an analytical expression with a small memory footprint while accurately representing the target 3D shape. At the same time, it offers a number of additional advantages including instant pose estimation and controllable level-of-detail reconstruction, while also enabling interpretation as a point cloud, volume, and a mesh model. We train 3D-GMNet end-to-end with single input images and corresponding 3D models by introducing two novel loss functions, a 3D Gaussian mixture loss and a 2D multi-view loss, which collectively enable accurate shape reconstruction as kernel density estimation. We thoroughly evaluate the effectiveness of 3D-GMNet with synthetic and real images of objects. The results show accurate reconstruction with a compact representation that also realizes novel applications of single-image 3D reconstruction.
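To make the abstract's claim concrete, the following is a minimal sketch of how a shape stored as a Gaussian mixture can be read as a density, a point cloud, or an occupancy test. It is not the authors' implementation: the function names (`mixture_density`, `sample_point_cloud`, `is_occupied`), the tuple layout for components, the threshold value, and the restriction to axis-aligned (diagonal) covariances are all assumptions made here for brevity.

```python
import math
import random

# Hypothetical layout: each component is (weight, mean, std), where mean and
# std are 3-tuples. Diagonal covariances are a simplification assumed here;
# the abstract does not state this restriction.


def mixture_density(point, components):
    """Evaluate the Gaussian mixture density at a single 3D point."""
    total = 0.0
    for weight, mean, std in components:
        norm = 1.0
        exponent = 0.0
        for x, m, s in zip(point, mean, std):
            norm *= s * math.sqrt(2.0 * math.pi)
            exponent += ((x - m) / s) ** 2
        total += weight * math.exp(-0.5 * exponent) / norm
    return total


def sample_point_cloud(components, n, seed=0):
    """Draw n points from the mixture, i.e. read it as a point cloud."""
    rng = random.Random(seed)
    weights = [w for w, _, _ in components]
    points = []
    for _ in range(n):
        # Pick a component in proportion to its weight, then sample from it.
        _, mean, std = rng.choices(components, weights=weights, k=1)[0]
        points.append(tuple(rng.gauss(m, s) for m, s in zip(mean, std)))
    return points


def is_occupied(point, components, threshold):
    """Read the mixture as a volume by thresholding the density."""
    return mixture_density(point, components) >= threshold


# Toy shape with two blobs (illustrative values only).
toy_shape = [
    (0.5, (0.0, 0.0, 0.0), (1.0, 1.0, 1.0)),
    (0.5, (3.0, 0.0, 0.0), (0.5, 0.5, 0.5)),
]
cloud = sample_point_cloud(toy_shape, 100)
inside = is_occupied((0.0, 0.0, 0.0), toy_shape, threshold=0.01)
```

Under these assumptions each component needs 7 numbers (1 weight, 3 means, 3 standard deviations), or 10 with a full symmetric 3x3 covariance, which illustrates why such a representation can be compact compared with a dense voxel grid.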
