CV | Mar 24, 2025

CFG-Zero*: Improved Classifier-Free Guidance for Flow Matching Models

Weichen Fan, Amber Yijia Zheng, Raymond A. Yeh, Ziwei Liu

arXiv:2503.18886v2 | 37 citations | h-index: 4 | Has Code

Originality: Incremental advance

AI Analysis

This work addresses a specific bottleneck in flow matching models for generative AI, offering an incremental improvement to CFG for better image and video generation.

The paper shows that Classifier-Free Guidance (CFG) steers samples toward incorrect trajectories in flow matching models when the velocity estimate is still inaccurate, as in the early stages of training. It proposes CFG-Zero*, which combines an optimized scale with zero-init steps, and reports that it consistently outperforms CFG on text-to-image and text-to-video generation.

Classifier-Free Guidance (CFG) is a widely adopted technique in diffusion/flow models to improve image fidelity and controllability. In this work, we first analytically study the effect of CFG on flow matching models trained on Gaussian mixtures where the ground-truth flow can be derived. We observe that in the early stages of training, when the flow estimation is inaccurate, CFG directs samples toward incorrect trajectories. Building on this observation, we propose CFG-Zero*, an improved CFG with two contributions: (a) optimized scale, where a scalar is optimized to correct for the inaccuracies in the estimated velocity, hence the * in the name; and (b) zero-init, which involves zeroing out the first few steps of the ODE solver. Experiments on both text-to-image (Lumina-Next, Stable Diffusion 3, and Flux) and text-to-video (Wan-2.1) generation demonstrate that CFG-Zero* consistently outperforms CFG, highlighting its effectiveness in guiding Flow Matching models. (Code is available at github.com/WeichenFan/CFG-Zero-star)
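The two contributions named in the abstract can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the abstract does not give the formula for the optimized scalar, so the sketch assumes it is the least-squares coefficient that projects the conditional velocity onto the unconditional one. The names `optimized_scale`, `guided_velocity`, `euler_sample`, `velocity_fn`, `w`, and `zero_init_steps` are hypothetical, and a plain Euler solver stands in for whatever ODE solver a given model uses.

```python
import numpy as np


def optimized_scale(v_cond, v_uncond, eps=1e-8):
    """Scalar s minimizing ||v_cond - s * v_uncond||^2 (assumed form of the optimized scale)."""
    dot = float(np.sum(v_cond * v_uncond))
    sq_norm = float(np.sum(v_uncond ** 2)) + eps  # eps guards against division by zero
    return dot / sq_norm


def guided_velocity(v_cond, v_uncond, w, step, zero_init_steps=1):
    """CFG-style guided velocity with a rescaled unconditional branch and zero-init."""
    # Zero-init: output a zero velocity for the first few solver steps.
    if step < zero_init_steps:
        return np.zeros_like(v_cond)
    s = optimized_scale(v_cond, v_uncond)
    # Standard CFG combination, with the unconditional velocity rescaled by s.
    return s * v_uncond + w * (v_cond - s * v_uncond)


def euler_sample(velocity_fn, x0, num_steps, w, zero_init_steps=1):
    """Integrate the guided ODE from t=0 to t=1 with fixed-step Euler.

    velocity_fn(x, t) is a hypothetical callable returning (v_cond, v_uncond).
    """
    x = np.array(x0, dtype=float)
    dt = 1.0 / num_steps
    for step in range(num_steps):
        t = step * dt
        v_cond, v_uncond = velocity_fn(x, t)
        x = x + dt * guided_velocity(v_cond, v_uncond, w, step, zero_init_steps)
    return x
```

Under this assumed form, setting the scalar to 1 and `zero_init_steps` to 0 recovers ordinary CFG, which makes the two changes easy to toggle independently when comparing outputs.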
