LG · Sep 26, 2025

Stage-wise Dynamics of Classifier-Free Guidance in Diffusion Models

arXiv:2509.22007v1 · 7 citations · h-index: 4
Originality: Incremental advance
AI Analysis

This work addresses a key problem for users of diffusion models by providing a unified theoretical understanding of guidance dynamics, though it is incremental as it builds on prior studies of CFG.

The paper analyzes the dynamics of Classifier-Free Guidance in diffusion models under multimodal conditionals, revealing that the sampling process occurs in three stages (Direction Shift, Mode Separation, Concentration), which explains why stronger guidance improves semantic alignment but reduces diversity. Experiments show that early strong guidance erodes global diversity, while late strong guidance suppresses fine-grained variation, and a proposed time-varying guidance schedule improves both quality and diversity.

Classifier-Free Guidance (CFG) is widely used to improve conditional fidelity in diffusion models, but its impact on sampling dynamics remains poorly understood. Prior studies, often restricted to unimodal conditional distributions or simplified cases, provide only a partial picture. We analyze CFG under multimodal conditionals and show that the sampling process unfolds in three successive stages. In the Direction Shift stage, guidance accelerates movement toward the weighted mean, introducing initialization bias and norm growth. In the Mode Separation stage, local dynamics remain largely neutral, but the inherited bias suppresses weaker modes, reducing global diversity. In the Concentration stage, guidance amplifies within-mode contraction, diminishing fine-grained variability. This unified view explains a widely observed phenomenon: stronger guidance improves semantic alignment but inevitably reduces diversity. Experiments support these predictions, showing that early strong guidance erodes global diversity, while late strong guidance suppresses fine-grained variation. Moreover, our theory naturally suggests a time-varying guidance schedule, and empirical results confirm that it consistently improves both quality and diversity.
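The abstract mentions a time-varying guidance schedule but does not give its form here. The sketch below is only an illustration of the idea, not the authors' schedule: it uses the standard CFG combination of unconditional and conditional predictions, plus a hypothetical `guidance_schedule` that keeps guidance weak early (where strong guidance is said to erode global diversity) and late (where it is said to suppress fine-grained variation), peaking in between. The function names, the sinusoidal shape, the normalized time convention, and the values of `w_min` and `w_max` are all assumptions.

```python
import numpy as np


def cfg_combine(eps_uncond, eps_cond, w):
    """Standard CFG combination of two model predictions.

    w = 0 gives the unconditional prediction, w = 1 the plain conditional
    prediction, and w > 1 extrapolates past the conditional prediction.
    """
    return eps_uncond + w * (eps_cond - eps_uncond)


def guidance_schedule(t, w_min=1.0, w_max=7.5):
    """Hypothetical time-varying guidance weight (not taken from the paper).

    t is normalized sampling progress: 0 at the start (pure noise),
    1 at the end (final sample). The weight equals w_min at both ends
    and rises to w_max at the midpoint.
    """
    t = np.clip(t, 0.0, 1.0)
    return w_min + (w_max - w_min) * np.sin(np.pi * t)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-ins for the outputs of a denoising network at one step.
    eps_uncond = rng.normal(size=4)
    eps_cond = rng.normal(size=4)
    for t in (0.0, 0.25, 0.5, 0.75, 1.0):
        w = guidance_schedule(t)
        guided = cfg_combine(eps_uncond, eps_cond, w)
        print(f"t={t:.2f}  w={w:.2f}  guided={np.round(guided, 3)}")
```

In a real sampler, `cfg_combine` would be applied at every denoising step with `w` looked up from the schedule at that step's progress value; a constant `w` recovers ordinary CFG.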
