CV · Jun 14, 2023

ZeroForge: Feedforward Text-to-Shape Without 3D Supervision

Kelly O. Marshall, Minh Pham, Ameya Joshi, Anushrut Jignasu, Aditya Balu, Adarsh Krishnamurthy, Chinmay Hegde

arXiv:2306.08183v23.93 citations · h-index: 16 · Has Code

Originality: Incremental advance

AI Analysis

This work addresses the challenge of generating 3D shapes from text for applications in design and visualization, offering a zero-shot approach that avoids costly inference-time optimization.

The paper tackles text-to-shape generation without 3D supervision or expensive inference-time optimization. It achieves open-vocabulary shape generation by adapting the architecture of existing feed-forward models and by combining a data-free CLIP loss with contrastive losses that guard against mode collapse.

Current state-of-the-art methods for text-to-shape generation either require supervised training using a labeled dataset of pre-defined 3D shapes, or perform expensive inference-time optimization of implicit neural representations. In this work, we present ZeroForge, an approach for zero-shot text-to-shape generation that avoids both pitfalls. To achieve open-vocabulary shape generation, we require careful architectural adaptation of existing feed-forward approaches, as well as a combination of data-free CLIP-loss and contrastive losses to avoid mode collapse. Using these techniques, we are able to considerably expand the generative ability of existing feed-forward text-to-shape models such as CLIP-Forge. We support our method via extensive qualitative and quantitative evaluations.
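The abstract names a CLIP loss and contrastive losses but does not give their form, so the following is only a minimal sketch of one common way such a pair could be combined, not the authors' formulation. It assumes each generated shape has already been rendered and embedded, and that the text prompts have been embedded in the same space. The function names (`similarity_loss`, `contrastive_loss`), the temperature value, and the two-dimensional toy vectors standing in for CLIP embeddings are all hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def similarity_loss(image_embs, text_embs):
    """Mean (1 - cosine) between each rendered shape and its own prompt."""
    pairs = zip(image_embs, text_embs)
    return sum(1.0 - cosine(i, t) for i, t in pairs) / len(text_embs)


def contrastive_loss(image_embs, text_embs, temperature=0.1):
    """InfoNCE-style loss (an assumed choice): each render should match
    its own prompt better than the other prompts in the batch."""
    n = len(text_embs)
    total = 0.0
    for i in range(n):
        logits = [cosine(image_embs[i], text_embs[j]) / temperature
                  for j in range(n)]
        # Numerically stable log-sum-exp over the batch.
        m = max(logits)
        log_z = m + math.log(sum(math.exp(l - m) for l in logits))
        total += log_z - logits[i]
    return total / n


# Toy stand-ins for embeddings (hypothetical values, not CLIP outputs).
text = [[1.0, 0.0], [0.0, 1.0]]
diverse = [[0.9, 0.1], [0.1, 0.9]]    # each render tracks its own prompt
collapsed = [[0.7, 0.7], [0.7, 0.7]]  # every prompt yields the same shape
```

With these toy vectors, the collapsed batch is penalized by the contrastive term: identical renders cannot be told apart by prompt, so the loss sits at log 2 for a batch of two, while the diverse batch scores close to zero. That is the intuition behind pairing a similarity term with a contrastive one to discourage mode collapse.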

