GRCL Jun 10, 2025

Token Perturbation Guidance for Diffusion Models

arXiv:2506.10036v2
14 citations
h-index: 6
Originality: Incremental advance
AI Analysis

This paper addresses the problem of improving generation quality and alignment in diffusion models, offering researchers and practitioners a more flexible, condition-agnostic alternative to CFG. The contribution is incremental, as it builds on existing guidance techniques.

The paper tackles two limitations of classifier-free guidance (CFG) in diffusion models: it requires specific training procedures, and it is limited to conditional generation. It proposes Token Perturbation Guidance (TPG), a training-free method that applies perturbation matrices to intermediate token representations. In experiments on SDXL and Stable Diffusion 2.1, TPG achieves nearly a 2× improvement in FID for unconditional generation over the SDXL baseline.

Classifier-free guidance (CFG) has become an essential component of modern diffusion models to enhance both generation quality and alignment with input conditions. However, CFG requires specific training procedures and is limited to conditional generation. To address these limitations, we propose Token Perturbation Guidance (TPG), a novel method that applies perturbation matrices directly to intermediate token representations within the diffusion network. TPG employs a norm-preserving shuffling operation to provide effective and stable guidance signals that improve generation quality without architectural changes. As a result, TPG is training-free and agnostic to input conditions, making it readily applicable to both conditional and unconditional generation. We further analyze the guidance term provided by TPG and show that its effect on sampling more closely resembles CFG compared to existing training-free guidance techniques. Extensive experiments on SDXL and Stable Diffusion 2.1 show that TPG achieves nearly a 2× improvement in FID for unconditional generation over the SDXL baseline, while closely matching CFG in prompt alignment. These results establish TPG as a general, condition-agnostic guidance method that brings CFG-like benefits to a broader class of diffusion models.
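
To make the norm-preserving shuffling idea concrete, here is a minimal NumPy sketch. It is not the authors' implementation. The function names (`shuffle_tokens`, `guided_prediction`) are hypothetical, and the CFG-style extrapolation used to combine the clean and perturbed predictions is an assumption, since the abstract does not state the exact form of the guidance term. Shuffling tokens is equivalent to multiplying by a permutation matrix, which is orthogonal, so the norms of the token representations are unchanged.

```python
import numpy as np


def shuffle_tokens(tokens, rng):
    """Randomly permute the token axis of a (num_tokens, dim) array.

    This equals left-multiplying by a permutation matrix. Such a matrix is
    orthogonal, so each token keeps its norm and the overall Frobenius norm
    of the array is preserved.
    """
    perm = rng.permutation(tokens.shape[0])
    return tokens[perm]


def guided_prediction(pred, pred_perturbed, scale):
    """Combine a clean and a perturbed prediction.

    Assumption: a CFG-style extrapolation away from the perturbed branch.
    The abstract does not give the exact formula.
    """
    return pred + scale * (pred - pred_perturbed)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    tokens = rng.normal(size=(16, 8))  # toy stand-in for intermediate tokens
    shuffled = shuffle_tokens(tokens, rng)
    # The two norms agree up to floating-point error.
    print(np.linalg.norm(tokens), np.linalg.norm(shuffled))
```

In a real sampler, the perturbed prediction would presumably come from a second forward pass of the denoiser with the shuffle applied inside the network, and `scale` would play the role that the guidance scale plays in CFG.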

Code Implementations: 1 repo
