SynCity: Training-Free Generation of 3D Worlds
SynCity addresses the challenge of scalable 3D world generation for applications such as gaming and virtual reality, though the contribution appears incremental because it builds on existing pre-trained models.
The paper tackles text-to-3D world generation by proposing SynCity, a training-free method that combines pre-trained 3D and 2D generative models to create large, high-quality scenes, which the authors describe as compelling and immersive.
We address the challenge of generating 3D worlds from textual descriptions. We propose SynCity, a training- and optimization-free approach, which leverages the geometric precision of pre-trained 3D generative models and the artistic versatility of 2D image generators to create large, high-quality 3D spaces. While most 3D generative models are object-centric and cannot generate large-scale worlds, we show how 3D and 2D generators can be combined to generate ever-expanding scenes. Through a tile-based approach, we allow fine-grained control over the layout and the appearance of scenes. The world is generated tile-by-tile, and each new tile is generated within its world-context and then fused with the scene. SynCity generates compelling and immersive scenes that are rich in detail and diversity.
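The tile-by-tile procedure described above can be illustrated with a minimal sketch. It is not the authors' implementation: the names `generate_world`, `neighbors`, and `stub_generate_tile` are hypothetical, the row-major visiting order and the four-neighbor context are assumptions, and the stub merely stands in for the pre-trained 2D and 3D generators, which the abstract does not specify.

```python
from typing import Callable, Dict, Tuple

Coord = Tuple[int, int]          # (x, y) position of a tile on the grid
Tile = str                       # placeholder for a generated 3D tile


def neighbors(coord: Coord, world: Dict[Coord, Tile]) -> Dict[Coord, Tile]:
    """Return the already-generated tiles adjacent to `coord` (assumed 4-neighborhood)."""
    x, y = coord
    candidates = [(x - 1, y), (x + 1, y), (x, y - 1), (x, y + 1)]
    return {c: world[c] for c in candidates if c in world}


def generate_world(
    prompts: Dict[Coord, str],
    generate_tile: Callable[[str, Dict[Coord, Tile]], Tile],
) -> Dict[Coord, Tile]:
    """Generate tiles one at a time, conditioning each on its existing neighbors."""
    world: Dict[Coord, Tile] = {}
    # Assumed visiting order: row by row, left to right.
    for coord in sorted(prompts, key=lambda c: (c[1], c[0])):
        context = neighbors(coord, world)                      # world-context of the new tile
        world[coord] = generate_tile(prompts[coord], context)  # "fuse" the tile into the scene
    return world


def stub_generate_tile(prompt: str, context: Dict[Coord, Tile]) -> Tile:
    """Hypothetical stand-in for the 2D/3D generators: records prompt and context size."""
    return f"{prompt}|ctx={len(context)}"


if __name__ == "__main__":
    layout = {(0, 0): "park", (1, 0): "market", (0, 1): "harbor", (1, 1): "tower"}
    for coord, tile in generate_world(layout, stub_generate_tile).items():
        print(coord, tile)
```

Per-tile prompts keyed by grid coordinate reflect the fine-grained layout control the abstract mentions. Because each tile only needs its neighbors as context, the grid can keep growing without regenerating earlier tiles.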