ST · ML · Mar 11, 2021

Non-Asymptotic Performance Guarantees for Neural Estimation of $\mathsf{f}$-Divergences

arXiv:2103.06923v2 · 23 citations
AI Analysis

This work addresses a theoretical gap for practitioners who use neural divergence estimators, although it is incremental in that it builds on existing variational methods.

The paper studies performance guarantees for neural network estimators of f-divergences. It derives non-asymptotic error bounds that quantify the tradeoff between approximation and estimation errors, and it validates the bounds numerically.

Statistical distances (SDs), which quantify the dissimilarity between probability distributions, are central to machine learning and statistics. A modern method for estimating such distances from data relies on parametrizing a variational form by a neural network (NN) and optimizing it. These estimators are abundantly used in practice, but corresponding performance guarantees are partial and call for further exploration. In particular, there seems to be a fundamental tradeoff between the two sources of error involved: approximation and estimation. While the former needs the NN class to be rich and expressive, the latter relies on controlling complexity. This paper explores this tradeoff by means of non-asymptotic error bounds, focusing on three popular choices of SDs -- Kullback-Leibler divergence, chi-squared divergence, and squared Hellinger distance. Our analysis relies on non-asymptotic function approximation theorems and tools from empirical process theory. Numerical results validating the theory are also provided.
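To make the idea of optimizing a parametrized variational form concrete, here is a minimal sketch for the Kullback-Leibler case. It uses the standard Donsker-Varadhan representation, $\mathsf{D}_{\mathsf{KL}}(P\|Q) = \sup_f \mathbb{E}_P[f] - \log \mathbb{E}_Q[e^f]$, which may differ from the exact form analyzed in the paper. Everything in it is an illustrative assumption: the two Gaussians, the sample size, the step size, and above all the one-parameter critic class $f_a(t) = a t$, which stands in for a neural network so that the example stays short and the optimization is concave.

```python
import numpy as np


def dv_objective(f_p, f_q):
    """Empirical Donsker-Varadhan objective: mean(f(X)) - log(mean(exp(f(Y))))."""
    shift = np.max(f_q)  # subtract the max for a numerically stable log-mean-exp
    log_mean_exp = shift + np.log(np.mean(np.exp(f_q - shift)))
    return np.mean(f_p) - log_mean_exp


rng = np.random.default_rng(0)
n = 100_000
x = rng.normal(0.0, 1.0, size=n)  # samples from P = N(0, 1)
y = rng.normal(1.0, 1.0, size=n)  # samples from Q = N(1, 1)
true_kl = 0.5  # closed form for these two unit-variance Gaussians: (mean gap)^2 / 2

# Hypothetical one-parameter critic class f_a(t) = a * t, standing in for a NN.
a = 0.0
step_size = 0.5
for _ in range(100):
    # Gradient of the objective in a: mean(x) minus the exp(a*y)-weighted mean of y.
    logits = a * y
    weights = np.exp(logits - np.max(logits))
    weights /= weights.sum()
    a += step_size * (np.mean(x) - np.sum(weights * y))

kl_estimate = dv_objective(a * x, a * y)
kl_at_zero_critic = dv_objective(np.zeros(n), np.zeros(n))

print(f"learned slope a = {a:.3f}")
print(f"DV estimate = {kl_estimate:.3f} (true KL = {true_kl})")
print(f"objective at f = 0: {kl_at_zero_critic:.3f}")
```

In this toy setting the critic class happens to contain the optimal critic (the log-density ratio is $0.5 - t$, and the objective is unchanged by adding a constant to $f$), so the approximation error is zero and the remaining gap to the true value comes from estimating expectations with finite samples. With a class that does not contain the optimum, the supremum over the class falls short of the divergence, which is the approximation side of the tradeoff described above.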

