ML · LG · Jun 5, 2019

Unbiased estimators for the variance of MMD estimators

arXiv:1906.02104v311.820 citations

Originality: Synthesis-oriented

AI Analysis

The unbiased variance estimator gives practitioners in machine learning and statistics a more accurate tool for comparing distributions, though it is an incremental correction to prior work.

The paper tackles the problem of biased variance estimation for maximum mean discrepancy (MMD) estimators in two-sample testing, showing that an unbiased estimator can be derived at no extra computational cost.

The maximum mean discrepancy (MMD) is a kernel-based distance between probability distributions useful in many applications (Gretton et al. 2012), bearing a simple estimator with pleasing computational and statistical properties. Being able to efficiently estimate the variance of this estimator is very helpful to various problems in two-sample testing. Towards this end, Bounliphone et al. (2016) used the theory of U-statistics to derive estimators for the variance of an MMD estimator, and differences between two such estimators. Their estimator, however, drops lower-order terms, and is unnecessarily biased. We show in this note - extending and correcting work of Sutherland et al. (2017) - that we can find a truly unbiased estimator for the actual variance of both the squared MMD estimator and the difference of two correlated squared MMD estimators, at essentially no additional computational cost.
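
For readers unfamiliar with the "simple estimator" the abstract mentions, here is a minimal sketch of the standard unbiased estimator of the squared MMD from Gretton et al. (2012). It is not the variance estimator derived in the paper, and the paper's own U-statistic form may differ in how it treats the cross terms. The Gaussian kernel, the `bandwidth` parameter, and the function names are illustrative assumptions, not taken from the source.

```python
import math


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (RBF) kernel between two points given as equal-length tuples.

    The kernel choice and bandwidth are assumptions made for illustration.
    """
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd2_unbiased(xs, ys, kernel=gaussian_kernel):
    """Unbiased estimate of the squared MMD between samples xs and ys.

    Within-sample sums skip the diagonal (i == j), which is what removes
    the bias; the cross term averages over all pairs.
    """
    m, n = len(xs), len(ys)
    if m < 2 or n < 2:
        raise ValueError("each sample needs at least two points")

    k_xx = sum(kernel(xs[i], xs[j])
               for i in range(m) for j in range(m) if i != j)
    k_yy = sum(kernel(ys[i], ys[j])
               for i in range(n) for j in range(n) if i != j)
    k_xy = sum(kernel(x, y) for x in xs for y in ys)

    return (k_xx / (m * (m - 1))
            + k_yy / (n * (n - 1))
            - 2.0 * k_xy / (m * n))


if __name__ == "__main__":
    same = [(0.0,), (1.0,)]
    far = [(100.0,), (101.0,)]
    print(mmd2_unbiased(same, same))  # can be negative for identical samples
    print(mmd2_unbiased(same, far))   # large when the samples are far apart
```

Because the estimator is unbiased rather than constrained to be non-negative, it can return negative values when the two samples come from the same distribution. This is one reason an accurate estimate of its variance matters for two-sample testing.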
