Metrizing Weak Convergence with Maximum Mean Discrepancies
This work provides theoretical foundations for kernel methods in machine learning by closing a specific gap in the understanding of MMD metrics. The contribution is incremental in that it builds on existing work.
The paper characterizes the kernels whose maximum mean discrepancy (MMD) metrizes weak convergence of probability measures, proving necessary and sufficient conditions on locally compact, non-compact Hausdorff spaces, and corrects a prior result of Simon-Gabriel & Schölkopf (JMLR, 2018, Thm. 12) by exhibiting counterexamples.
This paper characterizes the maximum mean discrepancies (MMD) that metrize the weak convergence of probability measures for a wide class of kernels. More precisely, we prove that, on a locally compact, non-compact, Hausdorff space, the MMD of a bounded continuous Borel measurable kernel k, whose reproducing kernel Hilbert space (RKHS) functions vanish at infinity, metrizes the weak convergence of probability measures if and only if k is continuous and integrally strictly positive definite (i.s.p.d.) over all signed, finite, regular Borel measures. We also correct a prior result of Simon-Gabriel & Schölkopf (JMLR, 2018, Thm.12) by showing that there exist both bounded continuous i.s.p.d. kernels that do not metrize weak convergence and bounded continuous non-i.s.p.d. kernels that do metrize it.
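To make the notion of "metrizing weak convergence" concrete, the following is a minimal numerical sketch and not code from the paper. It assumes the Gaussian kernel on the real line, a standard example of a bounded continuous i.s.p.d. kernel whose RKHS functions vanish at infinity, and uses the identity MMD²(P, Q) = E k(X, X') + E k(Y, Y') − 2 E k(X, Y), evaluated exactly for empirical measures. The function names `gaussian_kernel` and `mmd_squared` and the `bandwidth` parameter are illustrative choices, not names taken from the source.

```python
import numpy as np


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gram matrix k(x_i, y_j) = exp(-||x_i - y_j||^2 / (2 * bandwidth^2)).

    x has shape (n, d) and y has shape (m, d); the result has shape (n, m).
    """
    sq_dists = np.sum((x[:, None, :] - y[None, :, :]) ** 2, axis=-1)
    return np.exp(-sq_dists / (2.0 * bandwidth ** 2))


def mmd_squared(x, y, bandwidth=1.0):
    """Squared MMD between the empirical measures of the samples x and y.

    This is the biased (V-statistic) form: each expectation is replaced
    by the mean of the corresponding Gram matrix.
    """
    k_xx = gaussian_kernel(x, x, bandwidth).mean()
    k_yy = gaussian_kernel(y, y, bandwidth).mean()
    k_xy = gaussian_kernel(x, y, bandwidth).mean()
    return k_xx + k_yy - 2.0 * k_xy


if __name__ == "__main__":
    # Dirac masses at 1/n converge weakly to the Dirac mass at 0,
    # so the MMD of a kernel that metrizes weak convergence must shrink to 0.
    limit_point = np.array([[0.0]])
    for n in (1, 10, 100, 1000):
        point = np.array([[1.0 / n]])
        print(n, mmd_squared(point, limit_point))
```

For two Dirac masses at a and b the formula reduces to 2 − 2 exp(−|a − b|² / 2), so the printed values decrease to zero as 1/n approaches 0. This only illustrates the easy direction, in which weak convergence implies vanishing MMD for a bounded continuous kernel; the paper's result concerns when the converse also holds.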