ML LG

May 20, 2023

Normalizing flow sampling with Langevin dynamics in the latent space

Florentin Coeurdoux, Nicolas Dobigeon, Pierre Chainais

arXiv:2305.12149v1

11.89 citations

Originality: Incremental advance

AI Analysis

This paper addresses a specific issue in generative modeling: sampling from normalizing flows when the target distribution is complex. It offers an incremental improvement aimed at researchers who use normalizing flows.

Normalizing flows can behave pathologically when the target distribution has multiple modes or high-probability regions separated by very unlikely areas. To address this, the paper proposes a Markov chain Monte Carlo algorithm that runs Metropolis-adjusted Langevin dynamics in the latent space. The method preserves the tractability of the likelihood and works with any pre-trained flow without retraining. The authors report that it is efficient on synthetic and high-dimensional real data sets.
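For readers who want the mechanics behind this summary, the standard change-of-variables formula and the textbook MALA update can be written as below. The notation is an assumption made here for illustration and is not taken from the paper: f is the flow from latent to data space, J_f its Jacobian, p_X a density on the data space, q_Z its pull-back to the latent space, and ε a step size.

```latex
% Pull-back of a data-space density p_X through the flow f (change of variables)
q_Z(z) = p_X\big(f(z)\big)\,\bigl|\det J_f(z)\bigr|

% MALA proposal with step size \varepsilon > 0
z' = z + \varepsilon\,\nabla_z \log q_Z(z) + \sqrt{2\varepsilon}\,\xi,
\qquad \xi \sim \mathcal{N}(0, I)

% Metropolis-Hastings acceptance probability, where
% k(z' \mid z) = \mathcal{N}\big(z';\; z + \varepsilon\,\nabla_z \log q_Z(z),\; 2\varepsilon I\big)
\alpha(z, z') = \min\!\left(1,\;
  \frac{q_Z(z')\,k(z \mid z')}{q_Z(z)\,k(z' \mid z)}\right)
```

The Jacobian term in q_Z is what makes the latent dynamics depend explicitly on the transformation; accepted latent states are then mapped to the data space as x = f(z).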

Normalizing flows (NFs) use a continuous generator to map a simple latent (e.g., Gaussian) distribution towards an empirical target distribution associated with a training data set. Once trained by minimizing a variational objective, the learnt map provides an approximate generative model of the target distribution. Since standard NFs implement differentiable maps, they may suffer from pathological behaviors when targeting complex distributions. For instance, such problems may appear for distributions on multi-component topologies or characterized by multiple modes with high-probability regions separated by very unlikely areas. A typical symptom is the explosion of the Jacobian norm of the transformation in very low probability areas. This paper proposes to overcome this issue with a new Markov chain Monte Carlo algorithm that samples from the target distribution in the latent domain before transporting the samples back to the target domain. The approach relies on a Metropolis-adjusted Langevin algorithm (MALA) whose dynamics explicitly exploit the Jacobian of the transformation. Contrary to alternative approaches, the proposed strategy preserves the tractability of the likelihood and does not require specific training. Notably, it can be straightforwardly used with any pre-trained NF, regardless of the architecture. Experiments conducted on synthetic and high-dimensional real data sets illustrate the efficiency of the method.
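The idea of running MALA in the latent space and then transporting the samples can be illustrated with a minimal 1-D sketch. This is one reading of the abstract and not the authors' implementation. Everything named here is hypothetical: `flow` is a toy monotone map standing in for a pre-trained NF, `log_target` is an illustrative two-mode density, and the gradient uses central finite differences where a real setup would use automatic differentiation.

```python
import numpy as np

# Hypothetical stand-in for a pre-trained flow: a smooth, strictly
# increasing 1-D map from latent z to data x (assumption, not from the paper).
A, B = 2.0, 1.0

def flow(z):
    return z + A * np.tanh(B * z)

def log_abs_det_jac(z):
    # d flow / dz = 1 + A*B*(1 - tanh(B z)^2), which is always positive.
    return np.log1p(A * B * (1.0 - np.tanh(B * z) ** 2))

def log_target(x):
    # Unnormalized log density of an equal-weight mixture of
    # N(-2, 0.7^2) and N(2, 0.7^2) (illustrative target).
    s = 0.7
    return np.logaddexp(-0.5 * ((x + 2.0) / s) ** 2,
                        -0.5 * ((x - 2.0) / s) ** 2)

def log_latent(z):
    # Change of variables: target density pulled back to the latent space.
    return log_target(flow(z)) + log_abs_det_jac(z)

def grad_log_latent(z, h=1e-5):
    # Central finite difference; autodiff would replace this in practice.
    return (log_latent(z + h) - log_latent(z - h)) / (2.0 * h)

def mala(n_steps, step, z0, rng):
    """Metropolis-adjusted Langevin sampling of log_latent in 1-D."""
    z = z0
    samples = np.empty(n_steps)
    accepted = 0
    for i in range(n_steps):
        # Langevin proposal: gradient drift plus Gaussian noise.
        mean_fwd = z + step * grad_log_latent(z)
        z_new = mean_fwd + np.sqrt(2.0 * step) * rng.standard_normal()
        mean_bwd = z_new + step * grad_log_latent(z_new)
        # Log proposal densities (shared normalizing constants cancel).
        log_k_fwd = -(z_new - mean_fwd) ** 2 / (4.0 * step)
        log_k_bwd = -(z - mean_bwd) ** 2 / (4.0 * step)
        log_alpha = (log_latent(z_new) - log_latent(z)
                     + log_k_bwd - log_k_fwd)
        # Metropolis-Hastings accept/reject step.
        if np.log(rng.uniform()) < log_alpha:
            z = z_new
            accepted += 1
        samples[i] = z
    return samples, accepted / n_steps

rng = np.random.default_rng(0)
z_samples, acc_rate = mala(n_steps=5000, step=0.05, z0=0.0, rng=rng)
x_samples = flow(z_samples)  # transport latent samples to the data space
```

The accept/reject step keeps the chain exact for the pulled-back density for any step size, so `step` only affects mixing speed. The sketch makes no claim about how often the chain switches modes; that depends on the flow, the target, and the step size.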
