A note on the unique properties of the Kullback--Leibler divergence for sampling via gradient flows
This is an incremental theoretical insight for researchers in computational statistics and machine learning, simplifying sampling algorithms.
The paper tackles the problem of sampling from a probability distribution by showing that the Kullback--Leibler divergence is uniquely advantageous among Bregman divergences, as its gradient flow does not require knowledge of the normalising constant for many metrics.
We consider the problem of sampling from a probability distribution $π$. It is well known that this can be written as an optimisation problem over the space of probability distributions in which we aim to minimise a divergence from $π$. The optimisation problem is normally solved through gradient flows in the space of probability distributions with an appropriate metric. We show that the Kullback--Leibler divergence is the only divergence in the family of Bregman divergences whose gradient flow w.r.t. many popular metrics does not require knowledge of the normalising constant of $π$.
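As an illustration of why the normalising constant can drop out, consider one familiar special case, chosen here as an assumption rather than taken from the text above: the gradient flow of the Kullback--Leibler divergence under the Wasserstein-2 metric, whose standard time discretisation is the unadjusted Langevin algorithm. That scheme only uses the gradient of $\log π$, and rescaling an unnormalised density by a constant does not change this gradient. The minimal sketch below uses a hypothetical one-dimensional Gaussian target; the names `log_gamma`, `score`, `ula` and the parameter `log_c` are illustrative.

```python
import numpy as np


def log_gamma(x, log_c=0.0):
    """Log of a hypothetical unnormalised target.

    gamma(x) = exp(log_c) * exp(-(x - 2)^2 / 2), i.e. a Gaussian with
    mean 2 and unit variance, scaled by an arbitrary constant exp(log_c).
    """
    return log_c - 0.5 * (x - 2.0) ** 2


def score(x, log_c=0.0, eps=1e-5):
    """Central finite-difference estimate of d/dx log gamma(x).

    The additive constant log_c cancels in the difference, so the result
    does not depend on the scaling of gamma (up to rounding error).
    """
    return (log_gamma(x + eps, log_c) - log_gamma(x - eps, log_c)) / (2.0 * eps)


def ula(n_particles=5000, n_steps=500, step=0.05, log_c=0.0, seed=0):
    """Unadjusted Langevin algorithm started from standard normal particles."""
    rng = np.random.default_rng(seed)
    x = rng.standard_normal(n_particles)
    for _ in range(n_steps):
        noise = rng.standard_normal(n_particles)
        # Drift along the score plus Gaussian noise of variance 2 * step.
        x = x + step * score(x, log_c) + np.sqrt(2.0 * step) * noise
    return x


# Same seed, two different scalings of the unnormalised density.
samples_a = ula(log_c=0.0)
samples_b = ula(log_c=7.0)

print("mean:", samples_a.mean(), "std:", samples_a.std())
print("max difference between runs:", np.abs(samples_a - samples_b).max())
```

The two runs share a seed and differ only in the scaling of the unnormalised density, so any difference between them comes from floating-point rounding in the finite-difference score. The sample mean and standard deviation should be close to those of the target, with a small bias from the finite step size.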