ML LG Feb 27, 2024

Sequential transport maps using SoS density estimation and $α$-divergences

Benjamin Zanger, Olivier Zahm, Tiangang Cui, Martin Schreiber

arXiv:2402.17943v23.12 citations h-index: 15 Has Code

Originality: Incremental advance

AI Analysis

This work provides incremental improvements to transport-based density estimation methods, benefiting researchers in machine learning and statistics dealing with generative modeling and high-dimensional problems.

The paper tackles the problem of density estimation and sample generation by improving sequential transport maps, using Sum-of-Squares densities and α-divergences to enable convex optimization and handle unnormalized densities, with numerical demonstrations on Bayesian inference and unsupervised learning tasks.

Transport-based density estimation methods are receiving growing interest because of their ability to efficiently generate samples from the approximated density. We further investigate the sequential transport maps framework proposed in arXiv:2106.04170 and arXiv:2303.02554, which builds on a sequence of composed Knothe-Rosenblatt (KR) maps. Each of those maps is built by first estimating an intermediate density of moderate complexity, and then by computing the exact KR map from a reference density to the precomputed approximate density. In our work, we explore the use of Sum-of-Squares (SoS) densities and $α$-divergences for approximating the intermediate densities. Combining SoS densities with $α$-divergences yields convex optimization problems which can be efficiently solved using semidefinite programming. The main advantage of $α$-divergences is that they enable working with unnormalized densities, which provides benefits both numerically and theoretically. In particular, we provide a new convergence analysis of the sequential transport maps based on information-geometric properties of $α$-divergences. The choice of intermediate densities is also crucial for the efficiency of the method. While tempered (or annealed) densities are the state of the art, we introduce diffusion-based intermediate densities, which make it possible to approximate densities known from samples only. Such intermediate densities are well established in machine learning for generative modeling. Finally, we propose low-dimensional maps (or lazy maps) for dealing with high-dimensional problems and numerically demonstrate our methods on Bayesian inference problems and unsupervised learning tasks.
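To make the ingredients named in the abstract concrete, the following is a minimal one-dimensional sketch, not the authors' implementation. It assumes the Amari-type convention $D_α(f\|g) = \frac{1}{α(1-α)} \int \big(α f + (1-α) g - f^α g^{1-α}\big)\,dx$ for unnormalized densities, which may differ from the paper's parametrization. It also assumes an SoS density of the form $Φ(x)^\top A\,Φ(x)$ with monomial features $Φ$, a positive semidefinite matrix $A$, and a Gaussian weight added for integrability. The function names, the grid, and the matrix `B` are hypothetical choices for illustration, and no semidefinite program is solved here.

```python
import numpy as np

def trapezoid(y, x):
    """Trapezoidal rule on a 1D grid."""
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

def alpha_divergence(f, g, x, alpha=0.5):
    """Alpha-divergence between unnormalized densities tabulated on grid x.
    Assumed (Amari-type) convention; requires alpha not in {0, 1}."""
    integrand = alpha * f + (1.0 - alpha) * g - f**alpha * g**(1.0 - alpha)
    return trapezoid(integrand, x) / (alpha * (1.0 - alpha))

def sos_density(x, A):
    """Evaluate Phi(x)^T A Phi(x) * exp(-x^2 / 2) with Phi = (1, x, x^2, ...).
    The value is non-negative whenever A is positive semidefinite."""
    Phi = np.vander(x, N=A.shape[0], increasing=True)  # shape (n, m)
    return np.einsum("ni,ij,nj->n", Phi, A, Phi) * np.exp(-0.5 * x**2)

def kr_map_1d(x, target, reference):
    """Tabulate the 1D KR map T = F_target^{-1} o F_reference on grid x.
    Both densities may be unnormalized; the CDFs are normalized internally."""
    def cdf(p):
        increments = 0.5 * (p[1:] + p[:-1]) * np.diff(x)
        c = np.concatenate([[0.0], np.cumsum(increments)])
        return c / c[-1]
    return np.interp(cdf(reference), cdf(target), x)

# Illustrative setup: all values below are arbitrary assumptions.
x = np.linspace(-6.0, 6.0, 2001)
B = np.array([[1.0, 0.0, 0.0], [0.3, 1.0, 0.0], [-0.5, 0.2, 0.5]])
A = B @ B.T                                # PSD by construction
target = sos_density(x, A)                 # unnormalized SoS density
reference = np.exp(-0.5 * x**2)            # unnormalized Gaussian reference
T = kr_map_1d(x, target, reference)        # pushes reference toward target
mismatch = alpha_divergence(target, reference, x, alpha=0.5)
```

In one dimension the KR map reduces to composing the reference CDF with the inverse target CDF, which is why a tabulated interpolation suffices here. In higher dimensions the map is triangular and is built from conditional CDFs, which this sketch does not attempt.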
