ML | LG | Feb 18, 2025

Composition and Control with Distilled Energy Diffusion Models and Sequential Monte Carlo

James Thornton, Louis Bethune, Ruixiang Zhang, Arwen Bradley, Preetum Nakkiran, Shuangfei Zhai

Apple, Stanford

arXiv:2502.12786v1 | 25.723 citations | h-index: 19 | AISTATS

Originality: Incremental advance

AI Analysis

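This work addresses a specific bottleneck in diffusion models: energy-parameterized models have so far been unstable to train and have underperformed models that learn the score or denoiser directly. It offers incremental improvements in training stability and sampling control.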

The authors tackled the problem of inferior performance in energy-parameterized diffusion models by introducing a novel training regime through distillation of pre-trained models, achieving improved control and composition in generation via sequential Monte Carlo.

Diffusion models may be formulated as a time-indexed sequence of energy-based models, where the score corresponds to the negative gradient of an energy function. As opposed to learning the score directly, an energy parameterization is attractive as the energy itself can be used to control generation via Monte Carlo samplers. Architectural constraints and training instability in energy-parameterized models have so far yielded inferior performance compared to directly approximating the score or denoiser. We address these deficiencies by introducing a novel training regime for the energy function through distillation of pre-trained diffusion models, resembling a Helmholtz decomposition of the score vector field. We further showcase the synergies between energy and score by casting the diffusion sampling procedure as a Feynman-Kac model where sampling is controlled using potentials from the learnt energy functions. The Feynman-Kac model formalism enables composition and low-temperature sampling through sequential Monte Carlo.
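The two ideas in the abstract, that the score is the negative gradient of an energy and that the energy can drive a reweight-and-resample step, can be illustrated with a toy sketch. This is not the authors' method: it assumes a hand-written 1D Gaussian energy instead of a learnt network, and it performs a single importance-resampling step (the basic building block of sequential Monte Carlo) rather than a full Feynman-Kac sampler over diffusion time. All function names here are hypothetical.

```python
import numpy as np

def energy(x, sigma=1.0):
    """Toy energy of a zero-mean Gaussian (assumption): E(x) = x^2 / (2 sigma^2)."""
    return 0.5 * x**2 / sigma**2

def score(x, sigma=1.0):
    """Score as the negative gradient of the energy, written analytically."""
    return -x / sigma**2

def finite_difference_score(x, sigma=1.0, h=1e-5):
    """Numerical check of score = -dE/dx using central differences."""
    return -(energy(x + h, sigma) - energy(x - h, sigma)) / (2 * h)

def reweight_and_resample(particles, temperature, rng, sigma=1.0):
    """One importance-resampling step targeting exp(-E / temperature).

    Particles are assumed to be drawn from exp(-E), so the log-weight
    (the potential) is -(1/temperature - 1) * E(x).
    """
    log_w = -(1.0 / temperature - 1.0) * energy(particles, sigma)
    log_w -= log_w.max()          # stabilise the exponentials
    weights = np.exp(log_w)
    weights /= weights.sum()      # normalise to a probability vector
    # Multinomial resampling: draw particle indices in proportion to weight.
    idx = rng.choice(len(particles), size=len(particles), p=weights)
    return particles[idx]

rng = np.random.default_rng(0)
particles = rng.normal(0.0, 1.0, size=20000)   # samples from exp(-E), sigma = 1
cooled = reweight_and_resample(particles, temperature=0.5, rng=rng)
```

For this Gaussian toy case, the low-temperature target exp(-E / T) is again Gaussian with variance sigma^2 * T, so the resampled particles should be visibly more concentrated than the originals. The step needs the energy value itself rather than only its gradient, which is the reason the abstract gives for preferring an energy parameterization.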
