CV · Jun 14, 2024

OrientDream: Streamlining Text-to-3D Generation with Explicit Orientation Control

arXiv:2406.10000v1 · 2 citations
AI Analysis

This work addresses the fidelity and efficiency constraints of text-to-3D generation, with applications in 3D modeling and visualization. It is an incremental improvement over Dreamfusion.

The paper tackles the multi-head Janus issue and slow optimization in text-to-3D generation by introducing OrientDream, a framework that combines explicit camera orientation control with a decoupled back-propagation technique. The authors report high-quality NeRF models with consistent multi-view properties and significantly faster optimization than existing methods.

In the evolving landscape of text-to-3D technology, Dreamfusion has showcased its proficiency by utilizing Score Distillation Sampling (SDS) to optimize implicit representations such as NeRF. This process is achieved through the distillation of pretrained large-scale text-to-image diffusion models. However, Dreamfusion encounters fidelity and efficiency constraints: it faces the multi-head Janus issue and exhibits a relatively slow optimization process. To circumvent these challenges, we introduce OrientDream, a camera orientation conditioned framework designed for efficient and multi-view consistent 3D generation from textual prompts. Our strategy emphasizes the implementation of an explicit camera orientation conditioned feature in the pre-training of a 2D text-to-image diffusion module. This feature effectively utilizes data from MVImgNet, an extensive external multi-view dataset, to refine and bolster its functionality. Subsequently, we utilize the pre-conditioned 2D images as a basis for optimizing a randomly initialized implicit representation (NeRF). This process is significantly expedited by a decoupled back-propagation technique, allowing for multiple updates of implicit parameters per optimization cycle. Our experiments reveal that our method not only produces high-quality NeRF models with consistent multi-view properties but also achieves an optimization speed significantly greater than existing methods, as quantified by comparative metrics.
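The abstract does not spell out how the decoupled back-propagation works, so the following is only a minimal sketch of one plausible reading: the diffusion model is queried once per optimization cycle to obtain an SDS-style image-space gradient, that gradient is frozen into a pseudo-target image, and the implicit parameters are then updated several times against the frozen target without touching the diffusion model again. Everything here is hypothetical and chosen for illustration: `render` is a toy linear stand-in for NeRF rendering, `toy_noise_predictor` stands in for a frozen text-to-image diffusion model, and `prompt_image` plays the role of the image the text prompt favors. It is not the authors' implementation.

```python
import random

SCALE = 2.0  # gain of the toy renderer (assumption, for illustration only)


def render(theta):
    """Toy differentiable 'renderer': image = SCALE * theta (stands in for NeRF)."""
    return [SCALE * t for t in theta]


def toy_noise_predictor(noisy, sigma, prompt_image):
    """Hypothetical stand-in for a frozen diffusion model's noise prediction."""
    return [(n - p) / sigma for n, p in zip(noisy, prompt_image)]


def image_error(theta, prompt_image):
    """Squared distance between the rendered image and the prompt image."""
    return sum((x - p) ** 2 for x, p in zip(render(theta), prompt_image))


def decoupled_sds_cycle(theta, prompt_image, inner_steps=4, lr=0.1, sigma=0.5, seed=0):
    """One optimization cycle: a single diffusion query, then several parameter updates."""
    rng = random.Random(seed)
    image = render(theta)

    # Perturb the rendering with Gaussian noise, as in Score Distillation Sampling.
    noise = [rng.gauss(0.0, 1.0) for _ in image]
    noisy = [x + sigma * e for x, e in zip(image, noise)]

    # The only call to the (in practice expensive) diffusion model in this cycle.
    predicted = toy_noise_predictor(noisy, sigma, prompt_image)

    # SDS-style image-space gradient: weight * (predicted noise - injected noise).
    grad_image = [sigma * (p - e) for p, e in zip(predicted, noise)]

    # Freeze the gradient into a pseudo-target image; it is treated as a constant below.
    pseudo_target = [x - g for x, g in zip(image, grad_image)]

    # Several cheap updates of the implicit parameters against the frozen target.
    theta = list(theta)
    for _ in range(inner_steps):
        residual = [x - t for x, t in zip(render(theta), pseudo_target)]
        # Chain rule through the toy renderer: d(image)/d(theta) = SCALE.
        theta = [t - lr * SCALE * r for t, r in zip(theta, residual)]
    return theta
```

In this toy setting, calling `decoupled_sds_cycle` with a larger `inner_steps` moves the rendering closer to `prompt_image` for the same single diffusion query, which is the kind of saving the abstract attributes to multiple updates of implicit parameters per optimization cycle.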

