OrientDream: Streamlining Text-to-3D Generation with Explicit Orientation Control
This work addresses fidelity and efficiency constraints in text-to-3D generation, with applications in 3D modeling and visualization, and represents an incremental improvement over Dreamfusion.
The paper tackles the multi-head Janus issue and slow optimization in text-to-3D generation by introducing OrientDream, a framework that combines explicit camera orientation control with a decoupled back-propagation technique. The result is high-quality NeRF models with consistent multi-view properties and significantly faster optimization than existing methods.
In the evolving landscape of text-to-3D technology, Dreamfusion has demonstrated that Score Distillation Sampling (SDS) can optimize implicit representations such as NeRF by distilling pretrained large-scale text-to-image diffusion models. However, Dreamfusion faces fidelity and efficiency constraints: it suffers from the multi-head Janus issue and optimizes relatively slowly. To address these challenges, we introduce OrientDream, a camera-orientation-conditioned framework designed for efficient and multi-view consistent 3D generation from textual prompts. Our strategy adds explicit camera orientation conditioning to the pre-training of a 2D text-to-image diffusion module, and refines it with data from MVImgNet, an extensive external multi-view dataset. Subsequently, we use the pre-conditioned 2D images as a basis for optimizing a randomly initialized implicit representation (NeRF). This process is significantly expedited by a decoupled back-propagation technique, which allows multiple updates of the implicit parameters per optimization cycle. Our experiments show that our method not only produces high-quality NeRF models with consistent multi-view properties but also optimizes significantly faster than existing methods, as quantified by comparative metrics.
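To make the idea of several parameter updates per optimization cycle concrete, the following is a minimal NumPy sketch under strong assumptions; it is not the authors' implementation. The renderer is a toy linear map, `toy_noise_predictor` is a hypothetical stand-in for a frozen diffusion model, the weighting `w` is one common choice, and reading "decoupled back-propagation" as "call the diffusion model once, then take several cheap renderer-only gradient steps toward a fixed target" is an interpretation of the abstract. All function names are illustrative.

```python
import numpy as np

def render(theta, A):
    # Toy stand-in for a differentiable renderer: a linear map from
    # scene parameters theta to a flattened "image" (assumption).
    return A @ theta

def toy_noise_predictor(x_noisy, alpha_bar, prior_image):
    # Hypothetical stand-in for a frozen text-to-image diffusion model.
    # It returns the noise that would explain x_noisy if the clean image
    # were prior_image, i.e. it "believes" the prompt depicts prior_image.
    return (x_noisy - np.sqrt(alpha_bar) * prior_image) / np.sqrt(1.0 - alpha_bar)

def sds_residual(x, prior_image, rng):
    # One "expensive" diffusion call: noise the render, predict the noise,
    # and return the weighted residual w * (eps_hat - eps).
    alpha_bar = rng.uniform(0.2, 0.8)
    eps = rng.standard_normal(x.shape)
    x_noisy = np.sqrt(alpha_bar) * x + np.sqrt(1.0 - alpha_bar) * eps
    eps_hat = toy_noise_predictor(x_noisy, alpha_bar, prior_image)
    w = 1.0 - alpha_bar  # one common weighting choice, assumed here
    return w * (eps_hat - eps)

def optimize(theta, A, prior_image, outer_steps=50, inner_steps=1, lr=0.1, seed=0):
    rng = np.random.default_rng(seed)
    diffusion_calls = 0
    for _ in range(outer_steps):
        x = render(theta, A)
        residual = sds_residual(x, prior_image, rng)
        diffusion_calls += 1
        # Fixed image-space target, treated as a constant: no gradient
        # flows back through the diffusion model.
        target = x - residual
        # Cheap inner updates that only back-propagate through the renderer.
        for _ in range(inner_steps):
            grad = A.T @ (render(theta, A) - target)
            theta = theta - lr * grad
    return theta, diffusion_calls

def compare(seed=0):
    # Same number of diffusion calls, different number of inner updates.
    rng = np.random.default_rng(seed)
    A = rng.standard_normal((16, 4)) / 4.0
    theta_true = rng.standard_normal(4)
    prior_image = A @ theta_true
    theta0 = np.zeros(4)
    single, calls_single = optimize(theta0, A, prior_image, inner_steps=1)
    multi, calls_multi = optimize(theta0, A, prior_image, inner_steps=8)
    err_single = np.linalg.norm(single - theta_true)
    err_multi = np.linalg.norm(multi - theta_true)
    return err_single, err_multi, calls_single, calls_multi
```

With `inner_steps=1` the update reduces to an ordinary SDS step, because the gradient is then `A.T @ residual`. Calling `compare()` runs both settings with the same number of diffusion calls, so any difference in the final error comes only from the extra inner updates; in this toy setting the multi-update variant ends closer to the parameters favored by the stand-in prior.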