LG · Oct 1, 2025

Diffusion Alignment as Variational Expectation-Maximization

arXiv:2510.00502v1 · 1 citation · h-index: 10
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of aligning diffusion models with downstream objectives without sacrificing sample diversity, which matters for applications in generative AI and bioinformatics.

Existing diffusion alignment methods often suffer from reward over-optimization and mode collapse. The paper introduces DAV, a framework that formulates alignment as variational expectation-maximization, and reports that it optimizes reward while preserving diversity on text-to-image synthesis and DNA sequence design.

Diffusion alignment aims to optimize diffusion models for the downstream objective. While existing methods based on reinforcement learning or direct backpropagation achieve considerable success in maximizing rewards, they often suffer from reward over-optimization and mode collapse. We introduce Diffusion Alignment as Variational Expectation-Maximization (DAV), a framework that formulates diffusion alignment as an iterative process alternating between two complementary phases: the E-step and the M-step. In the E-step, we employ test-time search to generate diverse and reward-aligned samples. In the M-step, we refine the diffusion model using samples discovered by the E-step. We demonstrate that DAV can optimize reward while preserving diversity for both continuous and discrete tasks: text-to-image synthesis and DNA sequence design.
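The alternation described in the abstract can be illustrated with a deliberately small toy. The sketch below is not the DAV algorithm: it assumes a categorical distribution in place of a diffusion model, uses reward-weighted resampling as a stand-in for test-time search, and uses a damped maximum-likelihood update as the M-step. All names (`e_step`, `m_step`, `run_toy_em`), the reward table, and the hyperparameters `alpha` and `step_size` are hypothetical and chosen only for illustration.

```python
import math
import random


def e_step(probs, reward, n_candidates, n_keep, alpha, rng):
    """Toy stand-in for test-time search (an assumption, not DAV's search).

    Sample candidates from the current model, then resample them with
    weights proportional to exp(reward / alpha).
    """
    candidates = rng.choices(range(len(probs)), weights=probs, k=n_candidates)
    weights = [math.exp(reward(x) / alpha) for x in candidates]
    return rng.choices(candidates, weights=weights, k=n_keep)


def m_step(probs, samples, step_size):
    """Move the model toward the empirical distribution of E-step samples.

    This is a maximum-likelihood fit for a categorical model, damped by
    step_size so the model changes gradually between rounds.
    """
    counts = [0] * len(probs)
    for x in samples:
        counts[x] += 1
    target = [c / len(samples) for c in counts]
    return [(1 - step_size) * p + step_size * t for p, t in zip(probs, target)]


def expected_reward(probs, reward):
    return sum(p * reward(x) for x, p in enumerate(probs))


def entropy(probs):
    """Shannon entropy in nats, used here as a crude diversity measure."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def run_toy_em(n_rounds=30, seed=0):
    rng = random.Random(seed)
    # Hypothetical reward table with two equally good modes (items 2 and 3).
    rewards = [0.0, 0.2, 1.0, 1.0, 0.1, 0.0]
    reward = lambda x: rewards[x]
    probs = [1 / len(rewards)] * len(rewards)  # start from a uniform model
    for _ in range(n_rounds):
        samples = e_step(probs, reward, n_candidates=512, n_keep=256,
                         alpha=0.5, rng=rng)
        probs = m_step(probs, samples, step_size=0.3)
    return probs, reward


if __name__ == "__main__":
    probs, reward = run_toy_em()
    print("final probabilities:", [round(p, 3) for p in probs])
    print("expected reward:", round(expected_reward(probs, reward), 3))
    print("entropy (nats):", round(entropy(probs), 3))
```

In this toy, the temperature `alpha` controls how sharply the E-step favors high-reward samples. Because the resampling step is neutral between equally rewarded items, probability mass tends to stay spread over both high-reward modes rather than collapsing onto one, which is the qualitative behavior the abstract attributes to the E-step/M-step alternation.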

