ML LG Jan 20

Unified Unbiased Variance Estimation for MMD: Robust Finite-Sample Performance with Imbalanced Data and Exact Acceleration under Null and Alternative Hypotheses

arXiv:2601.13874v1
h-index: 2
Originality: Incremental advance
AI Analysis

This work addresses a bottleneck in kernel-based nonparametric testing: variance estimation for the MMD statistic when sample sizes are imbalanced. It offers improved inferential accuracy and computational efficiency for researchers and practitioners working with such data. The advance is incremental, since it builds on the existing U-statistic and Hoeffding decomposition frameworks.

The paper studies variance estimation for the maximum mean discrepancy (MMD) statistic in two-sample testing. It establishes a unified finite-sample characterization that covers the null and alternative hypotheses as well as balanced and imbalanced sample configurations, and it proposes an exact acceleration method that reduces computational complexity from O(n^2) to O(n log n) in the univariate case under the Laplacian kernel.

The maximum mean discrepancy (MMD) is a kernel-based nonparametric statistic for two-sample testing, whose inferential accuracy depends critically on variance characterization. Existing work provides various finite-sample estimators of the MMD variance, often differing under the null and alternative hypotheses and across balanced or imbalanced sampling schemes. In this paper, we study the variance of the MMD statistic through its U-statistic representation and Hoeffding decomposition, and establish a unified finite-sample characterization covering different hypotheses and sample configurations. Building on this analysis, we propose an exact acceleration method for the univariate case under the Laplacian kernel, which reduces the overall computational complexity from $\mathcal O(n^2)$ to $\mathcal O(n \log n)$.
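To make the $\mathcal O(n^2)$ versus $\mathcal O(n \log n)$ contrast concrete, the following is a minimal sketch of the well-known sorting trick for one-dimensional Laplacian kernel sums, applied here to the unbiased MMD point estimate. It is not the paper's variance-acceleration method, which the abstract does not spell out. The function names, the bandwidth parametrization $k(x, y) = \exp(-|x - y| / \sigma)$, and the default `sigma` are assumptions made for illustration.

```python
import numpy as np


def laplacian_pair_sum(z, sigma=1.0):
    """Sum of exp(-|z_i - z_j| / sigma) over all pairs i < j.

    Sorting costs O(n log n); the single pass afterwards is O(n).
    (Hypothetical helper, not taken from the paper.)
    """
    z = np.sort(np.asarray(z, dtype=float))
    total = 0.0
    running = 0.0  # sum over earlier points i < j of exp(-(z_j - z_i) / sigma)
    for j in range(1, len(z)):
        # Moving from z[j-1] to z[j] decays every earlier term by the same
        # factor; the "+ 1.0" adds the contribution of the point z[j-1] itself.
        running = (running + 1.0) * np.exp(-(z[j] - z[j - 1]) / sigma)
        total += running
    return total


def mmd2_unbiased_fast(x, y, sigma=1.0):
    """Unbiased (U-statistic) MMD^2 estimate using sorted pair sums."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    m, n = len(x), len(y)
    sxx = laplacian_pair_sum(x, sigma)
    syy = laplacian_pair_sum(y, sigma)
    # Pairs in the pooled sample = within-x pairs + within-y pairs + cross pairs.
    sxy = laplacian_pair_sum(np.concatenate([x, y]), sigma) - sxx - syy
    return 2.0 * sxx / (m * (m - 1)) + 2.0 * syy / (n * (n - 1)) - 2.0 * sxy / (m * n)


def mmd2_unbiased_naive(x, y, sigma=1.0):
    """Reference O(n^2) computation from the full kernel matrices."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    m, n = len(x), len(y)
    kxx = np.exp(-np.abs(x[:, None] - x[None, :]) / sigma)
    kyy = np.exp(-np.abs(y[:, None] - y[None, :]) / sigma)
    kxy = np.exp(-np.abs(x[:, None] - y[None, :]) / sigma)
    # Drop the diagonal (i == j) terms from the within-sample sums.
    term_xx = (kxx.sum() - np.trace(kxx)) / (m * (m - 1))
    term_yy = (kyy.sum() - np.trace(kyy)) / (n * (n - 1))
    return term_xx + term_yy - 2.0 * kxy.mean()
```

The two functions should agree up to floating-point error on any univariate samples, including imbalanced ones with different sizes for `x` and `y`. The recursive update avoids evaluating `exp` of large positive arguments, so it stays stable when the data range is large relative to `sigma`.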

