Programmable-Room: Interactive Textured 3D Room Meshes Generation Empowered by Large Language Models
This work addresses the challenge of interactive 3D room generation for applications in design and simulation, representing an incremental advancement by integrating existing methods like LLMs and diffusion models into a novel framework.
The authors generate and edit 3D room meshes from natural language instructions by decomposing the task into steps such as creating coordinates and textures. Their framework combines visual programming with LLMs and a diffusion model for panorama generation, and they report flexibility in generation and editing as well as quantitative and qualitative superiority over an existing model.
We present Programmable-Room, a framework that interactively generates and edits a 3D room mesh given natural language instructions. For precise control of each attribute of a room, we decompose the challenging task into simpler steps: creating plausible 3D coordinates for room meshes, generating panorama images for the texture, constructing 3D meshes by integrating the coordinates and panorama texture images, and arranging furniture. To support these decomposed tasks within a unified framework, we incorporate visual programming (VP). VP is a method that uses a large language model (LLM) to write a Python-like program, that is, an ordered list of the modules needed to carry out the tasks given in natural language. We develop most of the modules. In particular, for the texture generation module, we utilize a pretrained large-scale diffusion model to generate panorama images conditioned simultaneously on text and visual prompts (i.e., layout, depth, and semantic map). Specifically, we enhance the quality of panorama image generation by optimizing the training objective with a 1D representation of a panorama scene obtained from a bidirectional LSTM. We demonstrate Programmable-Room's flexibility in generating and editing 3D room meshes, and show our framework's superiority to an existing model both quantitatively and qualitatively. The project page is available at https://jihyun0510.github.io/Programmable_Room_Page/.
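
As a rough illustration of the visual-programming idea, the Python sketch below executes a program expressed as an ordered list of module calls, following the four decomposed steps named in the abstract. Every module name, the step format, and the `$name` convention for referring to earlier outputs are assumptions made for this example and are not taken from the Programmable-Room implementation. The program is hard-coded where an LLM would normally write it, and the texture step is a placeholder that runs no diffusion model.

```python
from typing import Any, Callable, Dict, List, Tuple

Point = Tuple[float, float, float]


# Hypothetical stand-ins for the decomposed steps named in the abstract.
def create_coordinates(width: float, depth: float, height: float) -> List[Point]:
    """Return the 8 corner points of a box-shaped room."""
    return [(x, y, z) for x in (0.0, width) for y in (0.0, depth) for z in (0.0, height)]


def generate_texture(prompt: str) -> Dict[str, str]:
    """Placeholder for panorama texture generation; only records the prompt."""
    return {"panorama_prompt": prompt}


def build_mesh(coords: List[Point], texture: Dict[str, str]) -> Dict[str, Any]:
    """Combine coordinates and texture into a toy mesh record."""
    return {"vertices": coords, "texture": texture, "furniture": []}


def arrange_furniture(mesh: Dict[str, Any], items: List[str]) -> Dict[str, Any]:
    """Return a copy of the mesh record with the furniture list filled in."""
    return {**mesh, "furniture": list(items)}


# Registry mapping the (assumed) module names used in programs to functions.
MODULES: Dict[str, Callable[..., Any]] = {
    "CreateCoordinates": create_coordinates,
    "GenerateTexture": generate_texture,
    "BuildMesh": build_mesh,
    "ArrangeFurniture": arrange_furniture,
}

# One step = (output variable, module name, keyword arguments).
Step = Tuple[str, str, Dict[str, Any]]


def run_program(program: List[Step]) -> Dict[str, Any]:
    """Execute the steps in order, storing each result under its output name."""
    state: Dict[str, Any] = {}
    for output_name, module_name, kwargs in program:
        # A string argument of the form "$name" refers to an earlier output.
        resolved = {
            key: state[value[1:]] if isinstance(value, str) and value.startswith("$") else value
            for key, value in kwargs.items()
        }
        state[output_name] = MODULES[module_name](**resolved)
    return state


# Hard-coded example of the kind of ordered program an LLM might be asked to write.
EXAMPLE_PROGRAM: List[Step] = [
    ("coords", "CreateCoordinates", {"width": 4.0, "depth": 3.0, "height": 2.5}),
    ("texture", "GenerateTexture", {"prompt": "a cozy bedroom with a wooden floor"}),
    ("mesh", "BuildMesh", {"coords": "$coords", "texture": "$texture"}),
    ("room", "ArrangeFurniture", {"mesh": "$mesh", "items": ["bed", "desk"]}),
]

if __name__ == "__main__":
    result = run_program(EXAMPLE_PROGRAM)
    print(result["room"]["furniture"])
```

Under this design, an edit instruction would correspond to a short program that reruns only the affected module, for example `GenerateTexture` followed by `BuildMesh`, while reusing the stored coordinates.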