MLLGJun 16, 2021

KALE Flow: A Relaxed KL Gradient Flow for Probabilities with Disjoint Support

arXiv:2106.08929v246 citations
Originality Incremental advance
AI Analysis

This work addresses a foundational challenge in machine learning for handling singular distributions, particularly useful when target distributions lie on low-dimensional manifolds, though it is incremental as it builds on existing divergence approximations.

The paper tackles the problem of gradient flow for probability distributions with disjoint support by introducing KALE, a relaxed approximation to the KL divergence that interpolates between KL and MMD, showing global convergence under smoothness assumptions and validating properties empirically.

We study the gradient flow for a relaxed approximation to the Kullback-Leibler (KL) divergence between a moving source and a fixed target distribution. This approximation, termed the KALE (KL approximate lower-bound estimator), solves a regularized version of the Fenchel dual problem defining the KL over a restricted class of functions. When using a Reproducing Kernel Hilbert Space (RKHS) to define the function class, we show that the KALE continuously interpolates between the KL and the Maximum Mean Discrepancy (MMD). Like the MMD and other Integral Probability Metrics, the KALE remains well defined for mutually singular distributions. Nonetheless, the KALE inherits from the limiting KL a greater sensitivity to mismatch in the support of the distributions, compared with the MMD. These two properties make the KALE gradient flow particularly well suited when the target distribution is supported on a low-dimensional manifold. Under an assumption of sufficient smoothness of the trajectories, we show the global convergence of the KALE flow. We propose a particle implementation of the flow given initial samples from the source and the target distribution, which we use to empirically confirm the KALE's properties.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes