CV · Apr 2, 2024

3D scene generation from scene graphs and self-attention

arXiv:2404.01887v3 · h-index: 25
Originality: Incremental advance
AI Analysis

This paper addresses the need for controllable 3D scene generation in applications such as simulated navigation and virtual reality. It represents an incremental improvement over existing methods.

The paper tackles the problem of synthesizing realistic and diverse indoor 3D scene layouts from scene graphs and floor plans, producing scenes that are sparser (7.9x compared to Graphto3D) and more diverse (16%).

Synthesizing realistic and diverse indoor 3D scene layouts in a controllable fashion opens up applications in simulated navigation and virtual reality. As concise and robust representations of a scene, scene graphs have proven to be well-suited as the semantic control on the generated layout. We present a variant of the conditional variational autoencoder (cVAE) model to synthesize 3D scenes from scene graphs and floor plans. We exploit the properties of self-attention layers to capture high-level relationships between objects in a scene, and use these as the building blocks of our model. Our model leverages graph transformers to estimate the size, dimension and orientation of the objects in a room while satisfying relationships in the given scene graph. Our experiments show that self-attention layers lead to sparser (7.9x compared to Graphto3D) and more diverse scenes (16%).
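
The two generic ingredients the abstract names, self-attention over the objects in a scene and sampling from a variational autoencoder's latent space, can be illustrated with a minimal sketch. This is not the paper's architecture: it omits the scene-graph and floor-plan conditioning, uses a single attention head, and every name in it (`self_attention`, `reparameterize`, the weight matrices, the toy feature sizes) is a hypothetical choice made only for illustration.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def matmul(a, b):
    """Multiply an (n x k) matrix by a (k x m) matrix, both lists of lists."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    x holds one feature row per scene object; every object attends to
    every other object, which is how pairwise relationships get mixed in.
    Returns the attended features and the attention weights.
    """
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d = len(q[0])
    k_t = [list(col) for col in zip(*k)]
    scores = matmul(q, k_t)
    weights = [softmax([s / math.sqrt(d) for s in row]) for row in scores]
    return matmul(weights, v), weights


def reparameterize(mu, log_var, rng):
    """VAE reparameterization trick: z = mu + sigma * eps, eps ~ N(0, 1)."""
    return [m + math.exp(0.5 * lv) * rng.gauss(0.0, 1.0) for m, lv in zip(mu, log_var)]


# Toy usage: 3 objects with 4-dimensional features (assumed sizes).
rng = random.Random(0)
num_objects, dim = 3, 4
x = [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(num_objects)]
w_q, w_k, w_v = (
    [[rng.uniform(-1, 1) for _ in range(dim)] for _ in range(dim)] for _ in range(3)
)
attended, weights = self_attention(x, w_q, w_k, w_v)
z = reparameterize([0.0] * dim, [0.0] * dim, rng)
```

Each row of `weights` sums to 1 and says how strongly one object attends to the others. In a conditional VAE, a latent sample such as `z` would be decoded together with the conditioning input, here the scene graph and floor plan, which this sketch leaves out.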

Code Implementations: 1 repo
