ML IT LG · Feb 23, 2018

Exponentially Consistent Kernel Two-Sample Tests

arXiv:1802.08407v21.0

Originality: Highly original

AI Analysis

This paper provides foundational theoretical guarantees for kernel two-sample tests, with implications for statistical hypothesis testing and nonparametric change detection.

The paper addresses the lack of an exact asymptotic performance characterization for kernel two-sample tests. It establishes that a class of such tests is exponentially consistent, with a type-II error probability that decays at the optimal rate, and that this rate does not depend on the particular kernel provided the kernel is bounded, continuous, and characteristic.

Given two sets of independent samples from unknown distributions $P$ and $Q$, a two-sample test decides whether to reject the null hypothesis that $P=Q$. Recent attention has focused on kernel two-sample tests because their test statistics are easy to compute, converge fast, and have finite-sample estimates with low bias. However, an exact characterization of the asymptotic performance of such tests is still lacking, in particular of the rate at which the type-II error probability decays to zero in the large-sample limit. In this work, we establish that a class of kernel two-sample tests is exponentially consistent when the sample space is Polish, locally compact, and Hausdorff, e.g., $\mathbb R^d$. The obtained exponential decay rate is further shown to be optimal among all two-sample tests satisfying the level constraint, and is independent of the particular kernel provided that it is bounded, continuous, and characteristic. Our results yield new insights into related issues such as fair alternatives for testing and kernel selection strategies. Finally, as an application, we show that a kernel-based test achieves optimal detection for off-line change detection in the nonparametric setting.
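For readers unfamiliar with kernel two-sample tests, the following is a minimal sketch of one common instance: a biased estimate of the squared maximum mean discrepancy (MMD) with a Gaussian kernel, which is bounded, continuous, and characteristic on $\mathbb R^d$. It is not the paper's procedure. The function names, the bandwidth, and the permutation-based calibration of the rejection threshold are assumptions chosen for illustration.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix between rows of a (n, d) and rows of b (m, d)."""
    # Pairwise squared Euclidean distances, clipped at 0 to absorb rounding error.
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def mmd2_biased(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of the squared MMD between samples x and y."""
    k_xx = gaussian_kernel(x, x, bandwidth)
    k_yy = gaussian_kernel(y, y, bandwidth)
    k_xy = gaussian_kernel(x, y, bandwidth)
    return k_xx.mean() + k_yy.mean() - 2.0 * k_xy.mean()


def permutation_test(x, y, bandwidth=1.0, n_permutations=200, alpha=0.05, seed=0):
    """Reject the null hypothesis P = Q when the permutation p-value is at most alpha.

    Assumption: the threshold is calibrated by permuting the pooled sample,
    a standard heuristic that is not taken from the paper.
    """
    rng = np.random.default_rng(seed)
    observed = mmd2_biased(x, y, bandwidth)
    pooled = np.vstack([x, y])
    n = len(x)
    null_stats = np.empty(n_permutations)
    for i in range(n_permutations):
        perm = rng.permutation(len(pooled))
        null_stats[i] = mmd2_biased(pooled[perm[:n]], pooled[perm[n:]], bandwidth)
    # Add-one correction keeps the p-value strictly positive.
    p_value = (1 + np.sum(null_stats >= observed)) / (1 + n_permutations)
    return observed, p_value, p_value <= alpha
```

A call such as `permutation_test(x, y)` with two arrays of shape `(n, d)` and `(m, d)` returns the statistic, the p-value, and the reject decision. The permutation step only controls the level of the test; the type-II error exponent discussed in the abstract concerns how fast the probability of failing to reject decays as the sample sizes grow.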
