A unified perspective on fine-tuning and sampling with diffusion and flow models

Carles Domingo-Enrich, Yuanqi Du, Michael S. Albergo

arXiv:2605.00229 90.31 citations

AI Analysis

For researchers working on generative modeling and fine-tuning, this work offers theoretical insights and practical guidance for choosing stable training methods.

This paper provides a unified framework for fine-tuning and sampling with diffusion and flow models under exponential tilting. Its bias-variance analysis shows that Adjoint Matching/Sampling and Novel Score Matching have finite gradient variance while Target and Conditional Score Matching do not, and the authors validate the analysis with reward fine-tuning experiments on Stable Diffusion 1.5 and 3.

We study the problem of training diffusion and flow generative models to sample from target distributions defined by an exponential tilting of a base density, a formulation that subsumes both sampling from unnormalized densities and reward fine-tuning of pre-trained models. This problem can be approached from a stochastic optimal control (SOC) perspective, using adjoint-based or score matching methods, or from a non-equilibrium thermodynamics perspective. We provide a unified framework encompassing these approaches and make three main contributions: (i) bias-variance decompositions revealing that Adjoint Matching/Sampling and Novel Score Matching have finite gradient variance, while Target and Conditional Score Matching do not; (ii) norm bounds on the lean adjoint ODE that theoretically support the effectiveness of adjoint-based methods; and (iii) adaptations of the CMCD and NETS loss functions, along with novel Crooks and Jarzynski identities, to the exponential tilting setting. We validate our analysis with reward fine-tuning experiments on Stable Diffusion 1.5 and 3.
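To make the phrase "exponential tilting of a base density" concrete, here is a minimal sketch. It is not one of the paper's methods. It assumes the target takes the form p*(x) ∝ p_base(x) · exp(r(x)), where the sign convention and the names `tilted_mean` and `reward` are assumptions made for illustration. With a standard normal base and the linear reward r(x) = 0.5·x, the tilted density is N(0.5, 1), so a self-normalized importance sampling estimate of its mean should come out close to 0.5.

```python
import math
import random


def tilted_mean(reward, n_samples=100_000, seed=0):
    """Estimate the mean of p*(x) ∝ N(x; 0, 1) * exp(reward(x)).

    Uses self-normalized importance sampling with the base density as the
    proposal. Illustrative only: the sign convention for the tilt is assumed.
    """
    rng = random.Random(seed)
    # Draw samples from the base density (standard normal).
    xs = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
    # Unnormalized log-weights are the rewards; shift by the max for stability.
    log_w = [reward(x) for x in xs]
    shift = max(log_w)
    ws = [math.exp(lw - shift) for lw in log_w]
    total = sum(ws)
    # The normalizing constant cancels in the weighted average.
    return sum(w * x for w, x in zip(ws, xs)) / total


# Linear reward r(x) = 0.5 * x tilts N(0, 1) into N(0.5, 1).
estimate = tilted_mean(lambda x: 0.5 * x)
print(f"estimated tilted mean: {estimate:.3f}")
```

Reweighting of this kind only works when the base and tilted densities overlap well. The training losses compared in the abstract instead learn a model that samples from the tilted target directly.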
