CV, GR, IV · Aug 28, 2018

3D-Aware Scene Manipulation via Inverse Graphics

arXiv:1808.09351v4 · 91 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for 3D-aware scene manipulation in computer vision and graphics, enabling edits such as rotating objects and changing their appearance. It is an incremental advance because it builds on existing disentangled-representation concepts.

The paper tackles the problem of obtaining interpretable, disentangled 3D scene representations for objects. It proposes 3D scene de-rendering networks (3D-SDN) that integrate semantics, geometry, and appearance, and reports that editing based on 3D-SDN is superior to its 2D counterpart.

We aim to obtain an interpretable, expressive, and disentangled scene representation that contains comprehensive structural and textural information for each object. Previous scene representations learned by neural networks are often uninterpretable, limited to a single object, or lacking 3D knowledge. In this work, we propose 3D scene de-rendering networks (3D-SDN) to address the above issues by integrating disentangled representations for semantics, geometry, and appearance into a deep generative model. Our scene encoder performs inverse graphics, translating a scene into a structured object-wise representation. Our decoder has two components: a differentiable shape renderer and a neural texture generator. The disentanglement of semantics, geometry, and appearance supports 3D-aware scene manipulation, e.g., rotating and moving objects freely while keeping the consistent shape and texture, and changing the object appearance without affecting its shape. Experiments demonstrate that our editing scheme based on 3D-SDN is superior to its 2D counterpart.
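To make the idea of a disentangled, object-wise representation concrete, here is a minimal Python sketch. It is not the 3D-SDN implementation: the `ObjectCode` record, its field names, and the `rotate`, `move`, and `retexture` helpers are all hypothetical, chosen only to show that a geometry edit can leave the appearance code untouched and vice versa.

```python
import math
from dataclasses import dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class ObjectCode:
    """Hypothetical per-object record (field names are illustrative, not from the paper)."""
    semantic_class: str                    # semantics, e.g. "car"
    position: Tuple[float, float, float]   # geometry: 3D translation
    yaw: float                             # geometry: rotation about the vertical axis, in radians
    texture_code: Tuple[float, ...]        # appearance: stand-in for a latent texture vector


def rotate(obj: ObjectCode, delta_yaw: float) -> ObjectCode:
    """Change only the rotation; semantics and appearance are copied unchanged."""
    return replace(obj, yaw=(obj.yaw + delta_yaw) % (2 * math.pi))


def move(obj: ObjectCode, offset: Tuple[float, float, float]) -> ObjectCode:
    """Change only the translation."""
    new_position = tuple(p + d for p, d in zip(obj.position, offset))
    return replace(obj, position=new_position)


def retexture(obj: ObjectCode, texture_code) -> ObjectCode:
    """Change only the appearance code; geometry is copied unchanged."""
    return replace(obj, texture_code=tuple(texture_code))


# Assumed example values, not data from the paper.
car = ObjectCode("car", (1.0, 0.0, 5.0), 0.0, (0.2, -0.1, 0.7))
edited = move(rotate(car, math.pi / 2), (0.5, 0.0, -1.0))
repainted = retexture(car, [0.9, 0.9, 0.1])
```

In this sketch each edit returns a new record and touches a single factor, which is the property the abstract attributes to disentanglement. In the paper, the edited representation would then pass through the decoder (the differentiable shape renderer and the neural texture generator) to produce an image; that step is not modeled here.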

Code Implementations: 1 repo
