LGFeb 10, 2025

Outsourced diffusion sampling: Efficient posterior inference in latent spaces of generative models

arXiv:2502.06999v213 citationsh-index: 56ICML
Originality Highly original
AI Analysis

This work addresses the problem of efficient posterior inference for users of generative models, particularly those working with large pretrained prior models, and offers an incremental solution to existing methods.

The authors tackled the problem of posterior inference in latent spaces of generative models, achieving efficient sampling with their proposed outsourced diffusion sampling method, which compares favorably with other inference methods. They demonstrated its effectiveness in several experiments, including conditional image generation and protein structure generation.

Any well-behaved generative model over a variable $\mathbf{x}$ can be expressed as a deterministic transformation of an exogenous ('outsourced') Gaussian noise variable $\mathbf{z}$: $\mathbf{x}=f_θ(\mathbf{z})$. In such a model (\eg, a VAE, GAN, or continuous-time flow-based model), sampling of the target variable $\mathbf{x} \sim p_θ(\mathbf{x})$ is straightforward, but sampling from a posterior distribution of the form $p(\mathbf{x}\mid\mathbf{y}) \propto p_θ(\mathbf{x})r(\mathbf{x},\mathbf{y})$, where $r$ is a constraint function depending on an auxiliary variable $\mathbf{y}$, is generally intractable. We propose to amortize the cost of sampling from such posterior distributions with diffusion models that sample a distribution in the noise space ($\mathbf{z}$). These diffusion samplers are trained by reinforcement learning algorithms to enforce that the transformed samples $f_θ(\mathbf{z})$ are distributed according to the posterior in the data space ($\mathbf{x}$). For many models and constraints, the posterior in noise space is smoother than in data space, making it more suitable for amortized inference. Our method enables conditional sampling under unconditional GAN, (H)VAE, and flow-based priors, comparing favorably with other inference methods. We demonstrate the proposed outsourced diffusion sampling in several experiments with large pretrained prior models: conditional image generation, reinforcement learning with human feedback, and protein structure generation.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes