ML, IT, LG, PR
Jul 7, 2020

Variational Representations and Neural Network Estimation of Rényi Divergences

arXiv:2007.03814v4
40 citations
Originality: Highly original
AI Analysis

This work addresses the problem of estimating divergences between probability measures in high-dimensional systems. It is aimed at researchers in machine learning and statistics and offers an alternative to density-estimator-based methods that is more effective in high dimensions.

The authors derive a new variational formula for the Rényi divergences that generalizes the classical Donsker-Varadhan result. They use it to build consistent neural network estimators that avoid density estimation, and they illustrate the approach with numerical examples in systems of up to 5000 dimensions.

We derive a new variational formula for the Rényi family of divergences, $R_\alpha(Q\|P)$, between probability measures $Q$ and $P$. Our result generalizes the classical Donsker-Varadhan variational formula for the Kullback-Leibler divergence. We further show that this Rényi variational formula holds over a range of function spaces; this leads to a formula for the optimizer under very weak assumptions and is also key in our development of a consistency theory for Rényi divergence estimators. By applying this theory to neural-network estimators, we show that if a neural network family satisfies one of several strengthened versions of the universal approximation property then the corresponding Rényi divergence estimator is consistent. In contrast to density-estimator-based methods, our estimators involve only expectations under $Q$ and $P$ and hence are more effective in high-dimensional systems. We illustrate this via several numerical examples of neural network estimation in systems of up to 5000 dimensions.
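
To make the "only expectations under $Q$ and $P$" point concrete, here is a minimal sketch. It is not the authors' code. It assumes the variational formula takes the following form, with the normalization $R_\alpha(Q\|P) = \frac{1}{\alpha(\alpha-1)} \log \int q^\alpha p^{1-\alpha}$ and $\alpha \neq 0, 1$ (the abstract does not state the formula, so treat both as assumptions to check against the paper):

$$R_\alpha(Q\|P) = \sup_g \left\{ \frac{1}{\alpha-1} \log \int e^{(\alpha-1) g}\, dQ - \frac{1}{\alpha} \log \int e^{\alpha g}\, dP \right\}$$

As $\alpha \to 1$ the first term tends to $\int g\, dQ$, which recovers the Donsker-Varadhan objective. The sketch replaces both integrals with sample means and uses a fixed test function instead of a trained neural network. The Gaussian pair, the names `renyi_objective` and `log_ratio`, and all parameter values are illustrative choices.

```python
import numpy as np


def log_mean_exp(a):
    """Numerically stable log(mean(exp(a))) for a 1-D array."""
    a = np.asarray(a, dtype=float)
    m = a.max()
    return m + np.log(np.mean(np.exp(a - m)))


def renyi_objective(g_q, g_p, alpha):
    """Sample version of the assumed variational objective.

    g_q: values of the test function g on samples drawn from Q
    g_p: values of g on samples drawn from P
    Only sample averages appear; no density estimate is required.
    """
    if alpha in (0.0, 1.0):
        raise ValueError("alpha must differ from 0 and 1")
    term_q = log_mean_exp((alpha - 1.0) * g_q) / (alpha - 1.0)
    term_p = log_mean_exp(alpha * g_p) / alpha
    return term_q - term_p


# Illustrative setup: Q = N(mu, 1), P = N(0, 1) in one dimension.
rng = np.random.default_rng(0)
mu, alpha, n = 1.0, 1.5, 200_000
x_q = rng.normal(mu, 1.0, n)
x_p = rng.normal(0.0, 1.0, n)


def log_ratio(x):
    """log(dQ/dP) for this Gaussian pair; it attains the supremum."""
    return mu * x - 0.5 * mu**2


# At g = log(dQ/dP) the objective should approach the closed form mu**2 / 2
# (under the normalization assumed above); other g give a lower value.
best = renyi_objective(log_ratio(x_q), log_ratio(x_p), alpha)
worse = renyi_objective(0.5 * log_ratio(x_q), 0.5 * log_ratio(x_p), alpha)
print("optimal g:   ", best)
print("suboptimal g:", worse)
print("closed form: ", 0.5 * mu**2)
```

In an estimator of the kind the abstract describes, the fixed function `log_ratio` would be replaced by a parametrized family (for example a neural network) and the objective maximized over its parameters.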

Code Implementations: 1 repo