LG AI May 16, 2024

Flow Score Distillation for Diverse Text-to-3D Generation

arXiv:2405.10988v2 6 citations h-index: 5
Originality: Incremental advance
AI Analysis

This paper addresses the limited diversity of text-to-3D generation, a concern for creative applications, and represents an incremental improvement over existing SDS-based methods.

The paper tackles the limited diversity of text-to-3D generation with Score Distillation Sampling (SDS). It shows that SDS is analogous to DDIM generation and proposes Flow Score Distillation (FSD), which uses a novel noise sampling approach and substantially enhances generation diversity without compromising quality across various diffusion models.

Recent advancements in text-to-3D generation have yielded remarkable progress, particularly through methods that rely on Score Distillation Sampling (SDS). While SDS can create impressive 3D assets, it is hindered by its inherently maximum-likelihood-seeking nature, which limits the diversity of generation outcomes. In this paper, we discover that the Denoising Diffusion Implicit Models (DDIM) generation process (i.e., the PF-ODE) can be succinctly expressed using an analogue of the SDS loss. Going one step further, SDS can be seen as a generalized DDIM generation process. Following this insight, we show that the noise sampling strategy in the noise addition stage significantly restricts the diversity of generation results. To address this limitation, we present an innovative noise sampling approach and introduce a novel text-to-3D method called Flow Score Distillation (FSD). Our validation experiments across various text-to-image diffusion models demonstrate that FSD substantially enhances generation diversity without compromising quality.
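The "maximum-likelihood-seeking" behaviour of SDS mentioned in the abstract can be illustrated with a toy sketch. The code below is not the paper's method and does not implement FSD. It assumes a 1-D Gaussian data distribution N(mu, s^2), for which the ideal noise predictor has a closed form, and it optimizes the sample directly (an identity "renderer") with weighting w(t) = 1 and a cosine noise schedule. The names `eps_pred` and `sds_optimize` and all parameter values are hypothetical choices for illustration.

```python
import numpy as np


def eps_pred(x_t, alpha, sigma, mu=2.0, s=1.0):
    """Closed-form noise prediction for data drawn from N(mu, s^2).

    Assumption: this stands in for a pretrained diffusion model.
    For x_t = alpha * x_0 + sigma * eps, the marginal of x_t is
    N(alpha * mu, alpha^2 * s^2 + sigma^2), and eps_pred = -sigma * score.
    """
    return sigma * (x_t - alpha * mu) / (alpha**2 * s**2 + sigma**2)


def sds_optimize(x_init, steps=3000, lr=0.01, seed=0):
    """Run SDS-style updates on a batch of independent 1-D 'scenes'.

    Each step draws fresh Gaussian noise, noises the current sample,
    and descends along (eps_pred - eps) with w(t) = 1.
    """
    rng = np.random.default_rng(seed)
    x = np.array(x_init, dtype=float)
    for _ in range(steps):
        t = rng.uniform(0.02, 0.98)            # random timestep in (0, 1)
        alpha = np.cos(0.5 * np.pi * t)        # variance-preserving schedule
        sigma = np.sin(0.5 * np.pi * t)
        eps = rng.standard_normal(x.shape)     # fresh noise every iteration
        x_t = alpha * x + sigma * eps          # noise addition stage
        grad = eps_pred(x_t, alpha, sigma) - eps
        x -= lr * grad
    return x


if __name__ == "__main__":
    starts = np.random.default_rng(1).normal(0.0, 3.0, size=256)
    finals = sds_optimize(starts)
    print("data std: 1.0, std of SDS results:", finals.std())
    print("mean of SDS results:", finals.mean())
```

In this toy setting, widely spread starting points end up clustered near the mode `mu`, with a spread far below the data standard deviation, which is the kind of diversity loss the abstract attributes to SDS.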

