71.9 ML Apr 20
One-Shot Generative Flows: Existence and Obstructions
Panos Tsimpos, Daniel Sharp, Youssef Marzouk
We study dynamic measure transport for generative modelling in the setting of a stochastic process $X_\bullet$ whose marginals interpolate between a source distribution $P_0$ and a target distribution $P_1$ while remaining independent, i.e., when $(X_0,X_1)\sim P_0\otimes P_1$. Conditional expectations of this process $X_\bullet$ define an ODE whose flow map transports from $P_0$ to $P_1$. We discuss when such a process induces a \emph{straight-line flow}, namely one whose pointwise acceleration vanishes and is therefore exactly integrable by any first-order method. We first develop multiple characterizations of straightness in terms of PDEs involving the conditional statistics of the process. Then, we prove that straightness under endpoint independence exhibits a sharp dichotomy. On one hand, we construct explicit, computable straight-line processes for arbitrary Gaussian endpoints. On the other hand, we show straight-line processes do not exist for targets with sufficiently well-separated modes. We demonstrate this through a sequence of increasingly general impossibility theorems that uncover a fundamental relationship between the sample-path behavior of a process with independent endpoints and the space-time geometry of this process' flow map. Taken together, these results provide a structural theory of when straight generative flows can, and cannot, exist.
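The claim that a straight-line flow is integrated exactly by any first-order method can be checked numerically. The following is a minimal sketch, not the construction from the paper: it assumes a hypothetical 1D affine map `T(x) = a*x + b` between Gaussians and says nothing about endpoint independence of the underlying process. The names `velocity` and `euler` and the parameters `a` and `b` are illustrative assumptions. Because the velocity is constant along each trajectory, forward Euler should land on `T(x0)` whether it takes one step or fifty.

```python
import numpy as np

# Hypothetical affine transport T(x) = a*x + b, pushing N(0, 1) to N(b, a^2).
a, b = 2.0, 3.0

def velocity(t, x):
    """Eulerian velocity of the straight-line flow x_t = (1 - t)*x0 + t*T(x0)."""
    x0 = (x - t * b) / (1.0 + t * (a - 1.0))  # invert x_t to recover the start point
    return (a - 1.0) * x0 + b                 # T(x0) - x0, constant along each path

def euler(x0, n_steps):
    """Integrate dx/dt = velocity(t, x) from t=0 to t=1 with forward Euler."""
    x = np.array(x0, dtype=float)
    dt = 1.0 / n_steps
    for k in range(n_steps):
        x = x + dt * velocity(k * dt, x)
    return x

x0 = np.random.default_rng(0).standard_normal(5)
one_step = euler(x0, 1)     # a single Euler step
many_steps = euler(x0, 50)  # a fine discretization of the same ODE
```

Both `one_step` and `many_steps` agree with `a * x0 + b` up to floating-point error, which is what vanishing acceleration along trajectories implies.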
52.1 ML May 13
To discretize continually: Mean shift interacting particle systems for Bayesian inference
Ayoub Belhadji, Daniel Sharp, Youssef M. Marzouk
Integration against a probability distribution given its unnormalized density is a central task in Bayesian inference and other fields. We introduce new methods for approximating such expectations with a small set of weighted samples -- i.e., a quadrature rule -- constructed via an interacting particle system that minimizes maximum mean discrepancy (MMD) to the target distribution. These methods extend the classical mean shift algorithm, as well as recent algorithms for optimal quantization of empirical distributions, to the case of continuous distributions. Crucially, our approach creates dynamics for MMD minimization that are invariant to the unknown normalizing constant; they also admit both gradient-free and gradient-informed implementations. The resulting mean shift interacting particle systems converge quickly, capture anisotropy and multi-modality, avoid mode collapse, and scale to high dimensions. We demonstrate their performance on a wide range of benchmark sampling problems, including multi-modal mixtures, Bayesian hierarchical models, PDE-constrained inverse problems, and beyond.
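As a rough illustration of the objective involved, the sketch below evaluates the squared MMD between a small weighted quadrature rule and a reference sample under a Gaussian kernel. It is only the standard discrete MMD formula, not the paper's algorithm: the paper targets continuous distributions known through an unnormalized density, whereas here an empirical sample stands in for the target. The function names, the bandwidth, and the node layout are all assumptions made for the example.

```python
import numpy as np

def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian kernel matrix between rows of x (n, d) and rows of y (m, d)."""
    sq_dists = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth**2))

def mmd_squared(x, w, y, v, bandwidth=1.0):
    """Squared MMD between the weighted point sets (x, w) and (y, v)."""
    kxx = gaussian_kernel(x, x, bandwidth)
    kxy = gaussian_kernel(x, y, bandwidth)
    kyy = gaussian_kernel(y, y, bandwidth)
    return w @ kxx @ w - 2.0 * (w @ kxy @ v) + v @ kyy @ v

rng = np.random.default_rng(0)
reference = rng.standard_normal((500, 2))  # empirical stand-in for the target
uniform = np.full(500, 1.0 / 500)

# A hypothetical five-node quadrature rule with equal weights.
nodes = np.array([[0.0, 0.0], [1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
weights = np.full(5, 0.2)

mmd_near = mmd_squared(nodes, weights, reference, uniform)
mmd_far = mmd_squared(nodes + 5.0, weights, reference, uniform)  # same rule, shifted away
```

Nodes placed where the target has mass give a smaller discrepancy than the same nodes shifted away from it, which is the quantity an MMD-minimizing particle system drives down.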
ML Feb 14, 2025
Weighted quantization using MMD: From mean field to mean shift via gradient flows
Ayoub Belhadji, Daniel Sharp, Youssef Marzouk
Approximating a probability distribution using a set of particles is a fundamental problem in machine learning and statistics, with applications including clustering and quantization. Formally, we seek a weighted mixture of Dirac measures that best approximates the target distribution. While much existing work relies on the Wasserstein distance to quantify approximation errors, maximum mean discrepancy (MMD) has received comparatively less attention, especially when allowing for variable particle weights. We argue that a Wasserstein-Fisher-Rao gradient flow is well-suited for designing quantizations optimal under MMD. We show that a system of interacting particles satisfying a set of ODEs discretizes this flow. We further derive a new fixed-point algorithm called mean shift interacting particles (MSIP). We show that MSIP extends the classical mean shift algorithm, widely used for identifying modes in kernel density estimators. Moreover, we show that MSIP can be interpreted as preconditioned gradient descent and that it acts as a relaxation of Lloyd's algorithm for clustering. Our unification of gradient flows, mean shift, and MMD-optimal quantization yields algorithms that are more robust than state-of-the-art methods, as demonstrated via high-dimensional and multi-modal numerical experiments.
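For readers unfamiliar with the classical mean shift algorithm that MSIP is said to extend, here is a minimal sketch of it for a Gaussian kernel density estimate. This is the textbook single-point iteration, not MSIP itself; the function name, bandwidth, tolerance, and synthetic two-cluster data are assumptions chosen for illustration.

```python
import numpy as np

def mean_shift(data, start, bandwidth=1.0, n_iters=200, tol=1e-10):
    """Classical mean shift: move a point toward a mode of the Gaussian KDE of `data`."""
    x = np.array(start, dtype=float)
    for _ in range(n_iters):
        # Gaussian kernel weight of every data point relative to the current iterate.
        sq_dists = np.sum((data - x) ** 2, axis=1)
        k = np.exp(-sq_dists / (2.0 * bandwidth**2))
        x_new = k @ data / k.sum()  # kernel-weighted mean of the data
        if np.linalg.norm(x_new - x) < tol:
            return x_new
        x = x_new
    return x

rng = np.random.default_rng(0)
data = np.vstack([
    rng.normal(loc=-3.0, scale=0.5, size=(200, 2)),  # cluster around (-3, -3)
    rng.normal(loc=3.0, scale=0.5, size=(200, 2)),   # cluster around (3, 3)
])
mode_left = mean_shift(data, start=[-2.0, -2.0])
mode_right = mean_shift(data, start=[2.0, 2.0])
```

Each starting point is pulled to the mode of the nearer cluster. The iteration is a fixed-point update with no step size to tune, which is the property the abstract's fixed-point algorithm builds on.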
89.2 CO Mar 12
Sampling through iterated approximation: Gradient-free and multi-fidelity Bayesian inference via transport
Daniel Sharp, Bart van Bloemen Waanders, Youssef Marzouk
We develop an iterative framework for Bayesian inference problems where the posterior distribution may involve computationally intensive models, intractable gradients, significant posterior concentration, and pronounced non-Gaussianity. Our approach integrates: (i) a generalized annealing scheme that combines geometric tempering with multi-fidelity modeling; (ii) expressive measure transport surrogates for the intermediate annealed and final target distributions, learned variationally without evaluating gradients of the target density; and (iii) an importance-weighting scheme to combine multiple quadrature rules, which recycles and reweights expensive model evaluations as successive posterior approximations are built. Our scheme produces both a quadrature rule for computing posterior expectations and a transport-based approximation of the posterior from which we can easily generate independent Monte Carlo samples. We demonstrate the efficiency and accuracy of our approach on low-dimensional but strongly non-Gaussian Bayesian inverse problems involving partial differential equations.