CV | May 15, 2019

3D Semantic Scene Completion from a Single Depth Image using Adversarial Training

arXiv:1905.06231v1 | 21 citations
Originality: Incremental advance
AI Analysis

This work addresses scene understanding for robotics or AR/VR applications, but it is incremental as it builds on existing GAN methods for a known task.

The paper tackles 3D semantic scene completion from a single depth image by exploring generative adversarial networks (GANs). It finds that conditional GANs outperform vanilla GANs, and that GANs outperform a baseline 3D CNN when annotations are clean but degrade when annotations are poorly aligned.

We address the task of 3D semantic scene completion, i.e., given a single depth image, we predict the semantic labels and occupancy of voxels in a 3D grid representing the scene. In light of the recently introduced generative adversarial networks (GANs), our goal is to explore the potential of this model and the efficiency of various important design choices. Our results show that conditional GANs outperform the vanilla GAN setup. We evaluate these architecture designs on several datasets. Based on our experiments, we demonstrate that GANs are able to outperform a baseline 3D CNN in the case of clean annotations, but they suffer from poorly aligned annotations.
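
As a minimal illustration of the output format the abstract describes, the sketch below stores a scene as a small voxel grid of semantic labels and derives occupancy from it. The grid size, the class names, the convention that label 0 means empty space, and the use of the raw depth image as the conditioning input are all assumptions made for illustration, not details taken from the paper. The two `*_discriminator_input` helpers are hypothetical names that only contrast what a vanilla and a conditional discriminator would be given.

```python
# Assumed label convention (illustration only): 0 = empty, others are semantic classes.
CLASS_NAMES = {0: "empty", 1: "floor", 2: "wall", 3: "chair"}


def make_grid(depth, height, width, fill=0):
    """Create a depth x height x width voxel grid filled with one label."""
    return [[[fill for _ in range(width)] for _ in range(height)]
            for _ in range(depth)]


def occupancy(labels):
    """Binary occupancy: a voxel is occupied if its label is not 'empty' (0)."""
    return [[[1 if v != 0 else 0 for v in row] for row in plane]
            for plane in labels]


def count_occupied(labels):
    """Number of occupied voxels in the label grid."""
    return sum(v for plane in occupancy(labels) for row in plane for v in row)


def vanilla_discriminator_input(prediction):
    """A vanilla GAN discriminator judges the predicted volume on its own."""
    return (prediction,)


def conditional_discriminator_input(condition, prediction):
    """A conditional GAN discriminator also receives the conditioning input."""
    return (condition, prediction)


# A tiny 2 x 2 x 3 scene with two floor voxels and one chair voxel.
grid = make_grid(2, 2, 3)
grid[0][0][0] = 1
grid[0][0][1] = 1
grid[1][1][2] = 3

# Made-up depth values standing in for the single input depth image.
depth_image = [[1.5, 1.5, 2.0], [1.5, 1.6, 2.1]]
```

In this sketch, occupancy is not predicted separately: it follows from the semantic labels, which is one common way to represent both quantities in a single grid.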
