CV | Sep 29, 2023

TextField3D: Towards Enhancing Open-Vocabulary 3D Generation with Noisy Text Fields

arXiv:2309.17175v2 | 10 citations | h-index: 56
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of generating diverse 3D content from text prompts, with applications in computer graphics and AI. It appears incremental because it builds on existing text-3D guidance methods.

The paper tackles the problem of limited text-3D data restricting open-vocabulary 3D generation by introducing TextField3D, a conditional 3D generative model that injects dynamic noise into text prompts to expand the textual latent space, achieving large vocabulary, text consistency, and low latency.

Recent works learn 3D representations explicitly under text-3D guidance. However, limited text-3D data restricts the vocabulary scale and text control of generations. Generators may easily collapse to a stereotyped concept for certain text prompts, thus losing open-vocabulary generation ability. To tackle this issue, we introduce a conditional 3D generative model, namely TextField3D. Specifically, rather than using the text prompts as input directly, we propose injecting dynamic noise into the latent space of the given text prompts, i.e., Noisy Text Fields (NTFs). In this way, limited 3D data can be mapped to the appropriate range of the textual latent space that is expanded by NTFs. To this end, an NTFGen module is proposed to model general text latent codes in noisy fields. Meanwhile, an NTFBind module is proposed to align view-invariant image latent codes to noisy fields, further supporting image-conditional 3D generation. To guide the conditional generation in both geometry and texture, multi-modal discrimination is constructed with a text-3D discriminator and a text-2.5D discriminator. Compared to previous methods, TextField3D offers three merits: 1) large vocabulary, 2) text consistency, and 3) low latency. Extensive experiments demonstrate that our method shows potential for open-vocabulary 3D generation.
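
As a rough illustration of the noise-injection idea in the abstract, the sketch below perturbs a text latent code with Gaussian noise whose scale is resampled on every call, then renormalizes the result. This is not the paper's NTFGen module: the function name `noisy_text_field`, the parameter `max_sigma`, the uniform sampling of the noise scale, and the unit-norm projection are all assumptions made for illustration.

```python
import math
import random


def noisy_text_field(text_latent, rng, max_sigma=0.02):
    """Return a perturbed, unit-norm copy of a text latent code.

    Hypothetical sketch: the noise model here is an assumption,
    not the formulation used in TextField3D.
    """
    # Project the input latent onto the unit sphere.
    norm = math.sqrt(sum(x * x for x in text_latent))
    unit = [x / norm for x in text_latent]

    # "Dynamic" is interpreted here as resampling the noise scale per call.
    sigma = rng.uniform(0.0, max_sigma)

    # Add isotropic Gaussian noise, then renormalize to unit length.
    noisy = [u + rng.gauss(0.0, sigma) for u in unit]
    noisy_norm = math.sqrt(sum(x * x for x in noisy))
    return [x / noisy_norm for x in noisy]


if __name__ == "__main__":
    rng = random.Random(0)
    # Stand-in for a text encoder output; a real latent would come from a model.
    latent = [rng.gauss(0.0, 1.0) for _ in range(512)]
    sample_a = noisy_text_field(latent, rng)
    sample_b = noisy_text_field(latent, rng)
    print(len(sample_a), sample_a != sample_b)
```

Under these assumptions, each call maps the same prompt latent to a slightly different nearby point, so a single text-3D pair covers a small region of the latent space instead of a single point.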

