DreamMesh: Jointly Manipulating and Texturing Triangle Meshes for Text-to-3D Generation
This work addresses the problem of producing explicit, textured 3D meshes from text, with applications in 3D modeling and content creation, and represents an incremental improvement over existing text-to-3D methods.
The paper tackles high-fidelity text-to-3D generation by addressing the noisy surfaces and cross-view inconsistency that arise with implicit representations such as NeRF. The resulting method, DreamMesh, is reported to significantly outperform state-of-the-art methods in generating 3D content with richer textual details and enhanced geometry.
Learning radiance fields (NeRF) with powerful 2D diffusion models has garnered popularity for text-to-3D generation. Nevertheless, the implicit 3D representations of NeRF lack explicit modeling of meshes and textures over surfaces, and this lack of well-defined surfaces can lead to issues such as noisy surfaces with ambiguous texture details and cross-view inconsistency. To alleviate this, we present DreamMesh, a novel text-to-3D architecture that pivots on well-defined surfaces (triangle meshes) to generate a high-fidelity explicit 3D model. Technically, DreamMesh capitalizes on a distinctive coarse-to-fine scheme. In the coarse stage, the mesh is first deformed by text-guided Jacobians, and then DreamMesh textures the mesh with an interlaced use of 2D diffusion models in a tuning-free manner from multiple viewpoints. In the fine stage, DreamMesh jointly manipulates the mesh and refines the texture map, leading to high-quality triangle meshes with high-fidelity textured materials. Extensive experiments demonstrate that DreamMesh significantly outperforms state-of-the-art text-to-3D methods in faithfully generating 3D content with richer textual details and enhanced geometry. Our project page is available at https://dreammesh.github.io.
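To make the phrase "deformed by Jacobians" concrete, the sketch below computes the Jacobian of a single triangle, that is, the 3x3 linear map taking a source triangle to its deformed counterpart. It assumes the Jacobians are per-triangle deformation gradients, as is common in mesh deformation work; the abstract does not confirm this. The helper names (`frame`, `face_jacobian`, and so on) are hypothetical, and the sketch covers neither the text guidance nor the step that recovers vertex positions from a set of Jacobians. It is not the authors' implementation.

```python
import math

def sub(a, b):
    """Component-wise difference of two 3D points."""
    return [a[i] - b[i] for i in range(3)]

def cross(a, b):
    """Cross product of two 3D vectors."""
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def frame(v0, v1, v2):
    """3x3 matrix whose columns are the two triangle edges and a scaled normal.

    The normal is divided by sqrt(|n|) so that it scales like the edges.
    Degenerate (zero-area) triangles are not handled in this sketch.
    """
    e1, e2 = sub(v1, v0), sub(v2, v0)
    n = cross(e1, e2)
    scale = math.sqrt(math.sqrt(sum(c * c for c in n)))
    n = [c / scale for c in n]
    return [[e1[i], e2[i], n[i]] for i in range(3)]

def det3(m):
    """Determinant of a 3x3 matrix."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))

def inv3(m):
    """Inverse of a 3x3 matrix via the adjugate (transposed cofactors)."""
    d = det3(m)
    return [[(m[(j + 1) % 3][(i + 1) % 3] * m[(j + 2) % 3][(i + 2) % 3]
              - m[(j + 1) % 3][(i + 2) % 3] * m[(j + 2) % 3][(i + 1) % 3]) / d
             for j in range(3)] for i in range(3)]

def matmul(a, b):
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def face_jacobian(src_tri, dst_tri):
    """Jacobian J with J @ frame(src) == frame(dst) for one triangle."""
    return matmul(frame(*dst_tri), inv3(frame(*src_tri)))

# Usage: a unit right triangle, uniformly scaled by 2 and rotated 90 degrees about z.
src = [(0, 0, 0), (1, 0, 0), (0, 1, 0)]
scaled = [(0, 0, 0), (2, 0, 0), (0, 2, 0)]
rotated = [(0, 0, 0), (0, 1, 0), (-1, 0, 0)]
J_scale = face_jacobian(src, scaled)
J_rot = face_jacobian(src, rotated)
```

For the uniform scaling the Jacobian comes out as twice the identity, and for the rotation it is the corresponding rotation matrix. A Jacobian-based deformation method would optimize one such matrix per face and then solve for vertex positions that best match them.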