ML, LG, Dec 11, 2018

Bounding the Error From Reference Set Kernel Maximum Mean Discrepancy

arXiv:1812.04594v1, 2 citations
Originality: Incremental advance
AI Analysis

The paper addresses the computational cost of two-sample statistical testing, though it appears incremental because it builds on existing kernel methods.

The paper tackles the error in two-sample testing when using weighted skeletonization of datasets with kernel maximum mean discrepancy, providing a non-asymptotic bound based on heat diffusion and weight uniformity. It demonstrates the method on several test examples.

In this paper, we bound the error induced by using a weighted skeletonization of two data sets for computing a two-sample test with kernel maximum mean discrepancy. The error is quantified in terms of the speed at which heat diffuses from those points to the rest of the data, as well as how flat the weights on the reference points are, yielding a non-asymptotic, non-probabilistic bound. The result ties into the problem of the eigenvector triple product, which appears in a number of important problems. The error bound also suggests an optimization scheme for choosing the best set of reference points and weights. The method is tested on several two-sample test examples.
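
To make the idea of a reference-set statistic concrete, here is a minimal Python sketch that compares a full kernel MMD estimate with a weighted reference-set approximation. It is an illustration under stated assumptions, not the paper's construction: the Gaussian kernel, the bandwidth, the random choice of reference points, the uniform weights, and the function names `gaussian_kernel`, `full_mmd2`, and `reference_mmd2` are all hypothetical choices made for this example. The reference-set statistic is assumed here to be a weighted sum of squared differences between the two empirical mean embeddings evaluated at the reference points.

```python
import numpy as np


def gaussian_kernel(a, b, bandwidth=1.0):
    """Gaussian kernel matrix between rows of a (n, d) and rows of b (m, d)."""
    sq_dists = (
        np.sum(a**2, axis=1)[:, None]
        + np.sum(b**2, axis=1)[None, :]
        - 2.0 * a @ b.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def full_mmd2(x, y, bandwidth=1.0):
    """Biased (V-statistic) estimate of squared kernel MMD using all pairs."""
    return float(
        gaussian_kernel(x, x, bandwidth).mean()
        + gaussian_kernel(y, y, bandwidth).mean()
        - 2.0 * gaussian_kernel(x, y, bandwidth).mean()
    )


def reference_mmd2(x, y, refs, weights, bandwidth=1.0):
    """Assumed reference-set statistic: weighted squared difference of the
    empirical mean embeddings, evaluated only at the reference points."""
    mean_embed_x = gaussian_kernel(refs, x, bandwidth).mean(axis=1)
    mean_embed_y = gaussian_kernel(refs, y, bandwidth).mean(axis=1)
    return float(np.sum(weights * (mean_embed_x - mean_embed_y) ** 2))


rng = np.random.default_rng(0)
x = rng.normal(0.0, 1.0, size=(500, 2))  # sample from the first distribution
y = rng.normal(0.5, 1.0, size=(500, 2))  # mean-shifted second distribution

# Hypothetical skeleton: 50 points drawn at random from the pooled data,
# with uniform ("flat") weights. The paper instead optimizes this choice.
pooled = np.vstack([x, y])
refs = pooled[rng.choice(len(pooled), size=50, replace=False)]
weights = np.full(len(refs), 1.0 / len(refs))

stat_full = full_mmd2(x, y)
stat_ref = reference_mmd2(x, y, refs, weights)
stat_ref_null = reference_mmd2(x, x, refs, weights)  # identical samples

print("full MMD^2:", stat_full)
print("reference-set statistic:", stat_ref)
print("reference-set statistic, identical samples:", stat_ref_null)
```

In this sketch the reference-set statistic needs kernel evaluations only between the reference points and the data, rather than between all pairs of data points, which is where the computational saving comes from. The two statistics are on different scales, so they are not directly interchangeable; the sketch does not compute the error bound discussed in the abstract.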
