CV | Apr 2, 2024

3D scene generation from scene graphs and self-attention

Pietro Bonazzi, Mengqi Wang, Diego Martin Arroyo, Fabian Manhardt, Nico Messikomer, Federico Tombari, Davide Scaramuzza

arXiv:2404.01887v32.0 | h-index: 25 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses the need for controllable 3D scene generation in applications such as simulated navigation and virtual reality. It represents an incremental improvement over existing methods.

The paper tackles the synthesis of realistic and diverse indoor 3D scene layouts from scene graphs and floor plans, reporting sparser scenes (7.9x compared to Graphto3D) and more diverse scenes (16%).

Synthesizing realistic and diverse indoor 3D scene layouts in a controllable fashion opens up applications in simulated navigation and virtual reality. As concise and robust representations of a scene, scene graphs have proven to be well-suited as the semantic control on the generated layout. We present a variant of the conditional variational autoencoder (cVAE) model to synthesize 3D scenes from scene graphs and floor plans. We exploit the properties of self-attention layers to capture high-level relationships between objects in a scene, and use these as the building blocks of our model. Our model leverages graph transformers to estimate the size, dimension and orientation of the objects in a room while satisfying relationships in the given scene graph. Our experiments show that self-attention layers lead to sparser (7.9x compared to Graphto3D) and more diverse scenes (16%).
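To make the role of self-attention in the abstract concrete, here is a minimal sketch of single-head scaled dot-product self-attention over a handful of per-object feature vectors. It is not the authors' model: the function names, the toy features, and the identity projection matrices are all assumptions for illustration, and a graph transformer would additionally condition on the scene-graph edges, which this sketch omits.

```python
import math

def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def self_attention(x, w_q, w_k, w_v):
    """Single-head scaled dot-product self-attention.

    x: one feature row per scene object (n x d).
    w_q, w_k, w_v: hypothetical projection matrices (d x d_k).
    Returns (outputs, attention_weights).
    """
    q, k, v = matmul(x, w_q), matmul(x, w_k), matmul(x, w_v)
    d_k = len(k[0])
    k_t = [list(col) for col in zip(*k)]  # transpose of the key matrix
    scores = [[s / math.sqrt(d_k) for s in row] for row in matmul(q, k_t)]
    weights = [softmax(row) for row in scores]  # each row sums to 1
    return matmul(weights, v), weights

# Toy input (assumed): three objects with made-up 2-D features.
objects = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
identity = [[1.0, 0.0], [0.0, 1.0]]  # placeholder for learned projections
outputs, weights = self_attention(objects, identity, identity, identity)
```

Each row of `weights` describes how strongly one object attends to every object in the scene, including itself, so each output row is a mixture of all object features. This all-pairs mixing is the mechanism by which attention layers can pick up relationships between objects.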
