LG · IT · ML · Apr 11, 2016

Demystifying Fixed k-Nearest Neighbor Information Estimators

arXiv:1604.03006v2 · 150 citations
AI Analysis

This work addresses a foundational statistical problem for researchers and practitioners in fields relying on mutual information estimation, offering incremental theoretical insights and an enhanced estimator.

The paper tackles the lack of theoretical understanding of the widely used KSG mutual information estimator by proving its consistency and bounding its bias convergence rate, and introduces a modified estimator that leverages a 'correlation boosting' effect for improved performance.

Estimating mutual information from i.i.d. samples drawn from an unknown joint density function is a basic statistical problem of broad interest with many applications. The most popular estimator is the one proposed by Kraskov, Stögbauer, and Grassberger (KSG) in 2004; it is nonparametric and based on the distance from each sample to its $k^{\rm th}$ nearest neighboring sample, where $k$ is a fixed small integer. Despite its widespread use (it is part of scientific software packages), the theoretical properties of this estimator have been largely unexplored. In this paper we demonstrate that the estimator is consistent and also identify an upper bound on the rate of convergence of the bias as a function of the number of samples. We argue that the superior performance of the KSG estimator stems from a curious "correlation boosting" effect, and we build on this intuition to modify the KSG estimator in novel ways to construct a superior estimator. As a byproduct of our investigations, we obtain nearly tight rates of convergence of the $\ell_2$ error of the well-known fixed-$k$ nearest-neighbor estimator of differential entropy by Kozachenko and Leonenko.
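To make the fixed-$k$ nearest-neighbor idea concrete, here is a minimal sketch of the two classical estimators the abstract refers to. It is not taken from the paper or its linked repository: it assumes scalar $X$ and $Y$, uses a brute-force $O(N^2)$ neighbor search, implements the first variant of the 2004 KSG estimator (not the modified estimator the paper proposes), and uses the $\psi(N)$ form of the Kozachenko-Leonenko estimator. The function names `digamma_int`, `kl_entropy`, and `ksg_mutual_information` are hypothetical.

```python
import math

EULER_GAMMA = 0.5772156649015329


def digamma_int(n):
    """Digamma at a positive integer: psi(n) = -gamma + sum_{j=1}^{n-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, n))


def kl_entropy(xs, k=3):
    """Kozachenko-Leonenko estimate of differential entropy (nats), scalar samples.

    Assumed form: psi(N) - psi(k) + log(c_1) + mean(log rho_i), where rho_i is
    the distance to the k-th nearest neighbor and c_1 = 2 is the length of the
    unit ball in one dimension.
    """
    n = len(xs)
    log_rho_sum = 0.0
    for i in range(n):
        dists = sorted(abs(xs[i] - xs[j]) for j in range(n) if j != i)
        log_rho_sum += math.log(dists[k - 1])
    return digamma_int(n) - digamma_int(k) + math.log(2.0) + log_rho_sum / n


def ksg_mutual_information(xs, ys, k=3):
    """KSG (2004, first variant) estimate of I(X;Y) in nats, scalar samples."""
    n = len(xs)
    total = 0.0
    for i in range(n):
        # Max-norm distances from sample i to every other sample in the joint space.
        dists = sorted(
            max(abs(xs[i] - xs[j]), abs(ys[i] - ys[j])) for j in range(n) if j != i
        )
        rho = dists[k - 1]  # distance to the k-th nearest neighbor
        # Marginal neighbors strictly closer than rho, excluding sample i itself.
        n_x = sum(1 for j in range(n) if j != i and abs(xs[i] - xs[j]) < rho)
        n_y = sum(1 for j in range(n) if j != i and abs(ys[i] - ys[j]) < rho)
        total += digamma_int(n_x + 1) + digamma_int(n_y + 1)
    return digamma_int(k) + digamma_int(n) - total / n
```

For jointly Gaussian samples with correlation $r$, the true mutual information is $-\tfrac{1}{2}\log(1-r^2)$ nats, which gives a convenient sanity check for `ksg_mutual_information`. A practical implementation would replace the brute-force loops with a k-d tree and handle vector-valued samples.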
