CV | Mar 9, 2025

Adding Additional Control to One-Step Diffusion with Joint Distribution Matching

arXiv:2503.06652v2 | 9 citations | h-index: 15
Originality: Highly original
AI Analysis

This work addresses the problem of efficiently updating one-step diffusion models to support new controls, such as structural constraints or user preferences. It is aimed at researchers and practitioners in generative AI, and it is incremental in that it builds on existing distillation methods.

The paper tackles the challenge of adapting distilled diffusion models to new controls without costly redistillation. It introduces Joint Distribution Matching (JDM), which minimizes the reverse KL divergence between image-condition joint distributions and, via a tractable upper bound, decouples fidelity learning from condition learning. The resulting one-step generator outperforms multi-step baselines such as ControlNet in most cases and achieves state-of-the-art performance in one-step text-to-image synthesis.

While diffusion distillation has enabled one-step generation through methods like Variational Score Distillation, adapting distilled models to newly emerging controls, such as novel structural constraints or the latest user preferences, remains challenging. Conventional approaches typically require modifying the base diffusion model and redistilling it, a process that is both computationally intensive and time-consuming. To address these challenges, we introduce Joint Distribution Matching (JDM), a novel approach that minimizes the reverse KL divergence between image-condition joint distributions. By deriving a tractable upper bound, JDM decouples fidelity learning from condition learning. This asymmetric distillation scheme enables our one-step student to handle controls unknown to the teacher model, and it facilitates improved use of classifier-free guidance (CFG) and seamless integration of human feedback learning (HFL). Experimental results demonstrate that JDM, using only a single step, surpasses baseline methods such as multi-step ControlNet in most cases, while achieving state-of-the-art performance in one-step text-to-image synthesis through improved use of CFG or integration of HFL.
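
To make the idea of matching joint distributions concrete, the sketch below writes out the reverse KL divergence between two image-condition joints and applies the standard chain rule for KL divergence. The symbols are assumptions introduced here for illustration and do not appear in the abstract: $p_\theta(x, c)$ is the joint induced by the one-step student over images $x$ and conditions $c$, and $q(x, c)$ is the target joint. This is a textbook identity, not the paper's upper bound, whose exact form may differ; it only indicates how an image-fidelity term and a condition term can separate.

```latex
% Reverse KL between image-condition joints (assumed notation, see lead-in).
\begin{align}
D_{\mathrm{KL}}\bigl(p_\theta(x, c)\,\|\,q(x, c)\bigr)
  &= \mathbb{E}_{(x, c) \sim p_\theta}\!\left[\log \frac{p_\theta(x, c)}{q(x, c)}\right] \\
  &= \underbrace{D_{\mathrm{KL}}\bigl(p_\theta(x)\,\|\,q(x)\bigr)}_{\text{image fidelity}}
   + \underbrace{\mathbb{E}_{x \sim p_\theta(x)}\!\left[
       D_{\mathrm{KL}}\bigl(p_\theta(c \mid x)\,\|\,q(c \mid x)\bigr)\right]}_{\text{condition consistency}}
\end{align}
```

The second line follows from factoring each joint as a marginal over $x$ times a conditional over $c$ given $x$. Under this reading, the first term depends only on image marginals while the second concerns how well conditions agree with generated images, which is one plausible way to understand how fidelity learning and condition learning can be treated separately.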

