Finite-Sample Analysis of Fixed-k Nearest Neighbor Density Functional Estimators
This work concerns the efficient estimation of density functionals, such as entropies and divergences, a problem of interest in statistics and machine learning. It gives finite-sample guarantees for fixed-k nearest neighbor estimators, most of which previously had none.
The paper studies estimators of functionals of a nonparametric continuous probability density that are built from k-nearest neighbor statistics. Instead of letting k grow with the sample size, these estimators hold k fixed and apply a bias correction, which lowers the computational cost and, in certain cases, improves statistical efficiency, giving faster convergence rates.
We provide a finite-sample analysis of a general framework for using k-nearest neighbor statistics to estimate functionals of a nonparametric continuous probability density, including entropies and divergences. Rather than plugging a consistent density estimate (which requires $k \to \infty$ as the sample size $n \to \infty$) into the functional of interest, the estimators we consider fix $k$ and perform a bias correction. This is more efficient computationally and, as we show in certain cases, statistically, leading to faster convergence rates. Our framework unifies several previous estimators, for most of which ours are the first finite-sample guarantees.
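As a concrete illustration of a bias-corrected fixed-$k$ estimator, the sketch below implements one common form of the Kozachenko-Leonenko entropy estimator, $\hat{H} = \psi(n) - \psi(k) + \log c_d + \frac{d}{n} \sum_{i=1}^{n} \log \varepsilon_{k,i}$, where $\varepsilon_{k,i}$ is the Euclidean distance from sample $i$ to its $k$-th nearest neighbor, $c_d$ is the volume of the unit ball in $d$ dimensions, and $\psi$ is the digamma function. The term $\psi(k)$ takes the place of the $\log k$ that a plug-in density estimate would give, and that substitution is the bias correction. This is a minimal sketch and not necessarily the exact form analyzed in the paper. The function names `digamma_int` and `knn_entropy`, the brute-force neighbor search, and the Gaussian test data are all assumptions made for illustration.

```python
import math
import random

EULER_GAMMA = 0.5772156649015329


def digamma_int(m):
    """Digamma at a positive integer m: psi(m) = -gamma + sum_{j=1}^{m-1} 1/j."""
    return -EULER_GAMMA + sum(1.0 / j for j in range(1, m))


def knn_entropy(points, k=3):
    """Fixed-k nearest neighbor entropy estimate in nats (hypothetical helper).

    points: list of equal-length tuples of floats, assumed to be i.i.d. draws
    from a continuous density, so that distance ties have probability zero.
    """
    n = len(points)
    d = len(points[0])
    if not 1 <= k < n:
        raise ValueError("k must satisfy 1 <= k < n")
    # log-volume of the unit Euclidean ball: pi^(d/2) / Gamma(d/2 + 1)
    log_unit_ball = (d / 2.0) * math.log(math.pi) - math.lgamma(d / 2.0 + 1.0)
    log_dist_sum = 0.0
    for i, x in enumerate(points):
        # Brute-force O(n) scan per point; a k-d tree would scale better.
        dists = sorted(math.dist(x, y) for j, y in enumerate(points) if j != i)
        log_dist_sum += math.log(dists[k - 1])  # distance to k-th neighbor
    # psi(k) replaces log(k): this is the fixed-k bias correction.
    return digamma_int(n) - digamma_int(k) + log_unit_ball + d * log_dist_sum / n


if __name__ == "__main__":
    random.seed(0)
    sample = [(random.gauss(0.0, 1.0),) for _ in range(500)]
    true_entropy = 0.5 * math.log(2.0 * math.pi * math.e)  # entropy of N(0, 1)
    print("estimate:", knn_entropy(sample, k=3))
    print("true    :", true_entropy)
```

Because $k$ stays fixed, each query needs only its $k$ nearest neighbors regardless of $n$, which is the source of the computational saving over plug-in estimators that require $k \to \infty$.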