CVMar 24, 2023

DreamStone: Image as Stepping Stone for Text-Guided 3D Shape Generation

arXiv:2303.15181v312 citationsh-index: 66Has Code
Originality Highly original
AI Analysis

This addresses the challenge of text-to-3D generation for applications like computer graphics and virtual reality, offering a flexible and scalable solution.

The paper tackles the problem of generating 3D shapes from text without paired data by using images as an intermediate step, achieving state-of-the-art results in generative quality and text consistency.

In this paper, we present a new text-guided 3D shape generation approach DreamStone that uses images as a stepping stone to bridge the gap between text and shape modalities for generating 3D shapes without requiring paired text and 3D data. The core of our approach is a two-stage feature-space alignment strategy that leverages a pre-trained single-view reconstruction (SVR) model to map CLIP features to shapes: to begin with, map the CLIP image feature to the detail-rich 3D shape space of the SVR model, then map the CLIP text feature to the 3D shape space through encouraging the CLIP-consistency between rendered images and the input text. Besides, to extend beyond the generative capability of the SVR model, we design a text-guided 3D shape stylization module that can enhance the output shapes with novel structures and textures. Further, we exploit pre-trained text-to-image diffusion models to enhance the generative diversity, fidelity, and stylization capability. Our approach is generic, flexible, and scalable, and it can be easily integrated with various SVR models to expand the generative space and improve the generative fidelity. Extensive experimental results demonstrate that our approach outperforms the state-of-the-art methods in terms of generative quality and consistency with the input text. Codes and models are released at https://github.com/liuzhengzhe/DreamStone-ISS.

Code Implementations2 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes