IT, DATA-AN, ML | Nov 7, 2014

Efficient Estimation of Mutual Information for Strongly Dependent Variables

arXiv:1411.2003v3 | 217 citations
AI Analysis

The paper addresses a bottleneck for researchers and practitioners in machine learning and statistics who need accurate mutual information estimates for strongly dependent variables.

Existing nonparametric mutual information estimators based on k-nearest-neighbor graphs need sample sizes that grow exponentially with the true mutual information, so they fail for strongly dependent variables. The paper introduces a new estimator that is robust to local non-uniformity and works well with limited data.

We demonstrate that a popular class of nonparametric mutual information (MI) estimators based on k-nearest-neighbor graphs requires a number of samples that scales exponentially with the true MI. Consequently, accurate estimation of MI between two strongly dependent variables is possible only for prohibitively large sample sizes. This important yet overlooked shortcoming of the existing estimators is due to their implicit reliance on local uniformity of the underlying joint distribution. We introduce a new estimator that is robust to local non-uniformity, works well with limited data, and is able to capture relationship strengths over many orders of magnitude. We demonstrate the superior performance of the proposed estimator on both synthetic and real-world data.
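
The sample-size limitation described in the abstract can be illustrated with a small experiment. The sketch below is not the paper's proposed estimator. It assumes the Kraskov-Stögbauer-Grassberger (KSG) estimator as a representative of the k-nearest-neighbor class (the abstract does not name a specific method), and the function names `ksg_mi` and `gaussian_mi` are hypothetical. For continuous data without ties, each marginal neighbor count is at least k - 1, so the KSG estimate cannot exceed ψ(N) - ψ(k), which grows only logarithmically in N; reaching a true MI of I nats therefore needs on the order of exp(I) samples.

```python
import numpy as np
from scipy.spatial import cKDTree
from scipy.special import digamma


def ksg_mi(x, y, k=3):
    """KSG (algorithm 1) estimate of I(X;Y) in nats.

    Illustrative sketch of a standard kNN-based estimator,
    NOT the estimator proposed in the paper.
    """
    x = np.asarray(x, dtype=float).reshape(len(x), -1)
    y = np.asarray(y, dtype=float).reshape(len(y), -1)
    n = len(x)
    joint = np.hstack([x, y])

    # Distance to the k-th neighbor in the joint space (max norm).
    # k + 1 because each point is returned as its own nearest neighbor.
    dist, _ = cKDTree(joint).query(joint, k=k + 1, p=np.inf)
    eps = dist[:, -1]

    # Count marginal neighbors strictly closer than eps.
    # nextafter shrinks the radius by one ulp to make the test strict;
    # subtract 1 to exclude the query point itself.
    radius = np.nextafter(eps, 0)
    nx = cKDTree(x).query_ball_point(x, radius, p=np.inf, return_length=True) - 1
    ny = cKDTree(y).query_ball_point(y, radius, p=np.inf, return_length=True) - 1

    return digamma(k) + digamma(n) - np.mean(digamma(nx + 1) + digamma(ny + 1))


def gaussian_mi(noise_std):
    """Exact MI (nats) for Y = X + noise, X ~ N(0, 1), noise ~ N(0, noise_std^2)."""
    return 0.5 * np.log(1.0 + 1.0 / noise_std**2)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    n, k = 1000, 3
    x = rng.standard_normal(n)
    ceiling = digamma(n) - digamma(k)  # upper limit of the KSG estimate
    for noise_std in (1.0, 1e-2, 1e-6):
        y = x + noise_std * rng.standard_normal(n)
        print(f"noise={noise_std:g}  true={gaussian_mi(noise_std):.2f}  "
              f"ksg={ksg_mi(x, y, k):.2f}  ceiling={ceiling:.2f}")
```

Under these assumptions, the estimate should track the true value for weak dependence and saturate below the ceiling as the noise shrinks, which is the failure mode the abstract attributes to the reliance on local uniformity.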

