CV · Nov 20, 2015

Multi-view 3D Models from Single Images with a Convolutional Network

arXiv:1511.06702v2 · 387 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of 3D reconstruction from limited data for computer vision applications, but the advance is incremental because it builds on existing methods for single-view 3D prediction.

The paper tackles the problem of inferring 3D representations from single images. It presents a convolutional network that predicts RGB images and depth maps of an object from arbitrary views; fusing several of these depth maps yields a full point cloud, which can be converted into a surface mesh. Trained on renderings of synthetic 3D models, the network produces reasonable predictions for real images of cars.

We present a convolutional network capable of inferring a 3D representation of a previously unseen object given a single image of this object. Concretely, the network can predict an RGB image and a depth map of the object as seen from an arbitrary view. Several of these depth maps fused together give a full point cloud of the object. The point cloud can in turn be transformed into a surface mesh. The network is trained on renderings of synthetic 3D models of cars and chairs. It successfully deals with objects on cluttered background and generates reasonable predictions for real images of cars.
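The fusion step mentioned in the abstract (several depth maps combined into one point cloud) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: it assumes a pinhole camera with a known intrinsics matrix `K`, depth stored as distance along the optical axis, a known 4x4 camera-to-world pose per view, and background pixels marked by non-positive depth. The function names `backproject_depth` and `fuse_depth_maps` are hypothetical.

```python
import numpy as np


def backproject_depth(depth, K, cam_to_world):
    """Lift one depth map of shape (H, W) to world-space points of shape (N, 3).

    Assumptions (not taken from the paper): pinhole intrinsics K (3x3),
    depth measured along the optical axis, and background pixels encoded
    as non-positive or non-finite depth.
    """
    h, w = depth.shape
    v, u = np.mgrid[0:h, 0:w]                  # pixel row (v) and column (u) indices
    valid = np.isfinite(depth) & (depth > 0)   # keep foreground pixels only
    z = depth[valid]
    x = (u[valid] - K[0, 2]) * z / K[0, 0]     # invert u = fx * x / z + cx
    y = (v[valid] - K[1, 2]) * z / K[1, 1]     # invert v = fy * y / z + cy
    pts_cam = np.stack([x, y, z, np.ones_like(z)], axis=1)  # homogeneous coords
    return (pts_cam @ cam_to_world.T)[:, :3]   # move into the shared world frame


def fuse_depth_maps(depths, K, poses):
    """Concatenate the back-projected points of several views into one cloud."""
    clouds = [backproject_depth(d, K, T) for d, T in zip(depths, poses)]
    return np.concatenate(clouds, axis=0)
```

Plain concatenation is the simplest possible fusion; it does not remove duplicate or inconsistent points where views overlap, so a real pipeline would likely filter the cloud before meshing.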

