CV · Dec 12, 2017

Im2Pano3D: Extrapolating 360 Structure and Semantics Beyond the Field of View

arXiv:1712.04569v1 · 79 citations
Originality: Incremental advance
AI Analysis

The paper addresses scene understanding and reconstruction for robotics and VR applications. The advance is incremental, since it builds on existing contextual priors and parameterization methods.

The paper tackles the problem of generating a full 360-degree panoramic view of 3D structure and semantics from a partial RGB-D image (≤50% observation) in indoor scenes, achieving over 56% pixel accuracy and less than 0.52m average distance error.

We present Im2Pano3D, a convolutional neural network that generates a dense prediction of 3D structure and a probability distribution of semantic labels for a full 360° panoramic view of an indoor scene when given only a partial observation (<= 50%) in the form of an RGB-D image. To make this possible, Im2Pano3D leverages strong contextual priors learned from large-scale synthetic and real-world indoor scenes. To ease the prediction of 3D structure, we propose to parameterize 3D surfaces with their plane equations and train the model to predict these parameters directly. To provide meaningful training supervision, we use multiple loss functions that consider both pixel level accuracy and global context consistency. Experiments demonstrate that Im2Pano3D is able to predict the semantics and 3D structure of the unobserved scene with more than 56% pixel accuracy and less than 0.52m average distance error, which is significantly better than alternative approaches.
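
The plane-equation parameterization mentioned in the abstract can be illustrated with a small sketch. It is not the authors' code: it assumes a plane is stored as a unit normal n and an offset d with n . p = d, and that the panorama uses an equirectangular layout, neither of which is stated in the abstract. The function names ray_direction and point_from_plane are hypothetical. The sketch shows how a per-pixel plane prediction determines a 3D point along that pixel's viewing ray.

```python
import math

def ray_direction(u, v, width, height):
    """Unit viewing ray for pixel (u, v), assuming an equirectangular panorama."""
    lon = (u + 0.5) / width * 2.0 * math.pi - math.pi    # longitude in (-pi, pi)
    lat = math.pi / 2.0 - (v + 0.5) / height * math.pi   # latitude in (-pi/2, pi/2)
    return (math.cos(lat) * math.sin(lon),
            math.sin(lat),
            math.cos(lat) * math.cos(lon))

def point_from_plane(normal, offset, ray):
    """Intersect a ray from the origin with the plane normal . p = offset.

    Returns the 3D point, or None if the ray is parallel to the plane
    or the intersection lies behind the camera.
    """
    denom = sum(n * r for n, r in zip(normal, ray))
    if abs(denom) < 1e-9:
        return None
    t = offset / denom  # distance along the ray to the plane
    if t <= 0:
        return None
    return tuple(t * r for r in ray)

# Assumed example: a wall 2 m in front of the camera (plane z = 2).
wall_normal, wall_offset = (0.0, 0.0, 1.0), 2.0
oblique_ray = (math.sin(math.pi / 4), 0.0, math.cos(math.pi / 4))
wall_point = point_from_plane(wall_normal, wall_offset, oblique_ray)
```

Under these assumptions, every pixel on the same wall shares one set of plane parameters even though its depth varies with viewing angle, which is one intuition for why predicting plane parameters may be easier than predicting raw depth.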

