CVAug 9, 2024

FlowDreamer: Exploring High Fidelity Text-to-3D Generation via Rectified Flow

arXiv:2408.05008v37 citationsh-index: 6
Originality Incremental advance
AI Analysis

This work addresses a key bottleneck in text-to-3D generation for applications like computer graphics and virtual reality, offering an incremental improvement over existing methods.

The paper tackles the problem of over-smoothing textures and over-saturating colors in text-to-3D generation by proposing FlowDreamer, a framework that integrates rectified flow models with Score Distillation Sampling, resulting in high fidelity 3D models with richer textual details and faster convergence.

Recent advances in text-to-3D generation have made significant progress. In particular, with the pretrained diffusion models, existing methods predominantly use Score Distillation Sampling (SDS) to train 3D models such as Neural RaRecent advances in text-to-3D generation have made significant progress. In particular, with the pretrained diffusion models, existing methods predominantly use Score Distillation Sampling (SDS) to train 3D models such as Neural Radiance Fields (NeRF) and 3D Gaussian Splatting (3D GS). However, a hurdle is that they often encounter difficulties with over-smoothing textures and over-saturating colors. The rectified flow model -- which utilizes a simple ordinary differential equation (ODE) to represent a straight trajectory -- shows promise as an alternative prior to text-to-3D generation. It learns a time-independent vector field, thereby reducing the ambiguity in 3D model update gradients that are calculated using time-dependent scores in the SDS framework. In light of this, we first develop a mathematical analysis to seamlessly integrate SDS with rectified flow model, paving the way for our initial framework known as Vector Field Distillation Sampling (VFDS). However, empirical findings indicate that VFDS still results in over-smoothing outcomes. Therefore, we analyze the grounding reasons for such a failure from the perspective of ODE trajectories. On top, we propose a novel framework, named FlowDreamer, which yields high fidelity results with richer textual details and faster convergence. The key insight is to leverage the coupling and reversible properties of the rectified flow model to search for the corresponding noise, rather than using randomly sampled noise as in VFDS. Accordingly, we introduce a novel Unique Couple Matching (UCM) loss, which guides the 3D model to optimize along the same trajectory.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes