DreamScape: 3D Scene Creation via Gaussian Splatting joint Correlation Modeling
This work addresses text-to-3D scene generation for applications such as virtual reality and gaming, but it appears incremental, as it builds on existing diffusion models and Gaussian Splatting methods.
The paper tackles the challenge of generating 3D scenes with multiple objects from text by introducing DreamScape. The method represents scenes with Gaussian Splatting, derives a 3D Gaussian Guide from LLMs, and optimizes from local objects to the global scene using progressive scale control and collision modeling. The authors report state-of-the-art performance for high-fidelity, controllable generation.
Recent advances in text-to-3D creation integrate the potent prior of Diffusion Models from text-to-image generation into the 3D domain. Nevertheless, generating 3D scenes with multiple objects remains challenging. Therefore, we present DreamScape, a method for generating 3D scenes from text. Utilizing Gaussian Splatting for 3D representation, DreamScape introduces a 3D Gaussian Guide that encodes semantic primitives, spatial transformations, and relationships from text using LLMs, enabling local-to-global optimization. Progressive scale control is tailored during local object generation, addressing the training instability that arises from simple blending in the global optimization stage. Collision relationships between objects are modeled at the global level to mitigate biases in LLM priors, ensuring physical correctness. Additionally, to generate pervasive objects such as rain and snow that are distributed extensively across the scene, we design a specialized sparse initialization and densification strategy. Experiments demonstrate that DreamScape achieves state-of-the-art performance, enabling high-fidelity, controllable 3D scene generation.
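To make the idea of a guide with per-object placements and a global collision term more concrete, here is a minimal sketch. The abstract does not specify the guide's format or the form of the collision model, so everything below is an assumption made for illustration: the `ObjectGuide` fields, the use of each object's scale as a bounding-sphere radius, and the penetration-depth penalty are hypothetical and are not taken from DreamScape.

```python
from dataclasses import dataclass
import math


@dataclass
class ObjectGuide:
    """One entry of a hypothetical scene guide: a text prompt plus a placement."""
    prompt: str                          # semantic primitive, e.g. "a wooden table"
    center: tuple[float, float, float]   # translation in scene coordinates
    scale: float                         # uniform scale, reused here as a bounding-sphere radius
    yaw_deg: float = 0.0                 # rotation about the vertical axis (unused by the penalty)


def collision_penalty(a: ObjectGuide, b: ObjectGuide) -> float:
    """Penetration depth of two bounding spheres; 0.0 when they do not overlap."""
    dist = math.dist(a.center, b.center)
    return max(0.0, a.scale + b.scale - dist)


def total_collision(guides: list[ObjectGuide]) -> float:
    """Sum the pairwise penalties over all unordered object pairs."""
    return sum(
        collision_penalty(guides[i], guides[j])
        for i in range(len(guides))
        for j in range(i + 1, len(guides))
    )


# A toy layout such as an LLM might propose (values invented for illustration).
scene = [
    ObjectGuide("a wooden table", (0.0, 0.0, 0.0), 1.0),
    ObjectGuide("a ceramic vase", (0.0, 1.5, 0.0), 0.75),
    ObjectGuide("a floor lamp", (4.0, 0.0, 0.0), 0.5),
]
print(total_collision(scene))
```

In this toy layout only the table and the vase overlap, so they alone contribute to the total. A global stage could, under these assumptions, adjust object centers to drive such a term toward zero; a real system would presumably measure overlap on the Gaussians themselves rather than on bounding spheres.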