LG · CV · ML · Jul 25, 2016

A Statistical Test for Joint Distributions Equivalence

arXiv:1607.07270v1 · 1 citation
Originality: Synthesis-oriented
AI Analysis

The paper provides a method for verifying dataset-shift in machine learning. The contribution is incremental, as it builds on existing kernel two-sample tests.

The paper tackles the problem of determining if two joint distributions are statistically different using a distribution-free test, extending kernel two-sample tests to joint distributions and enabling verification of dataset-shift in learning frameworks without assumptions about the shift type.

We provide a distribution-free test that can be used to determine whether any two joint distributions $p$ and $q$ are statistically different by inspection of a large enough set of samples. Following recent efforts from Long et al. [1], we rely on joint kernel distribution embedding to extend the kernel two-sample test of Gretton et al. [2] to the case of joint probability distributions. Our main result can be directly applied to verify if a dataset-shift has occurred between training and test distributions in a learning framework, without further assuming the shift has occurred only in the input, in the target or in the conditional distribution.
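As a rough illustration of the idea in the abstract, the sketch below compares two samples of (input, target) pairs with a kernel two-sample statistic computed on the joint distribution. It is not the paper's procedure. It assumes a product of Gaussian kernels on inputs and targets as the joint kernel, uses the biased squared-MMD estimate, and swaps in a permutation test in place of whatever acceptance threshold the paper derives. The function names, bandwidths, and number of permutations are hypothetical choices made for this example.

```python
import numpy as np


def gaussian_gram(a, bandwidth):
    """Gaussian (RBF) Gram matrix between all rows of a 2-D array."""
    sq_norms = np.sum(a ** 2, axis=1)
    sq_dists = sq_norms[:, None] + sq_norms[None, :] - 2.0 * a @ a.T
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth ** 2))


def mmd2_from_gram(gram, idx_a, idx_b):
    """Biased squared-MMD estimate between two index sets of a pooled Gram matrix."""
    k_aa = gram[np.ix_(idx_a, idx_a)].mean()
    k_bb = gram[np.ix_(idx_b, idx_b)].mean()
    k_ab = gram[np.ix_(idx_a, idx_b)].mean()
    return k_aa + k_bb - 2.0 * k_ab


def joint_permutation_test(x1, y1, x2, y2, bw_x=1.0, bw_y=1.0,
                           n_permutations=200, seed=0):
    """Return (statistic, p_value) for H0: both (x, y) samples share one joint law.

    Assumption of this sketch: the joint kernel is the product of a Gaussian
    kernel on x and a Gaussian kernel on y, and the null distribution is
    approximated by permuting sample membership (not the paper's threshold).
    """
    rng = np.random.default_rng(seed)
    x = np.vstack([x1, x2])
    y = np.vstack([y1, y2])
    # Product kernel on the pooled sample: k((x, y), (x', y')) = kx(x, x') * ky(y, y').
    gram = gaussian_gram(x, bw_x) * gaussian_gram(y, bw_y)

    n1, n_total = len(x1), len(x)
    observed = mmd2_from_gram(gram, np.arange(n1), np.arange(n1, n_total))

    exceed = 0
    for _ in range(n_permutations):
        perm = rng.permutation(n_total)
        stat = mmd2_from_gram(gram, perm[:n1], perm[n1:])
        exceed += int(stat >= observed)
    # Add-one correction keeps the p-value strictly positive.
    p_value = (exceed + 1) / (n_permutations + 1)
    return observed, p_value


# Toy data: identical marginals for x and for y, but the conditional y | x flips sign.
rng = np.random.default_rng(1)
x_train = rng.normal(size=(100, 1))
y_train = x_train + 0.1 * rng.normal(size=(100, 1))
x_test = rng.normal(size=(100, 1))
y_test = -x_test + 0.1 * rng.normal(size=(100, 1))

statistic, p_value = joint_permutation_test(x_train, y_train, x_test, y_test)
print(f"MMD^2 = {statistic:.4f}, permutation p-value = {p_value:.4f}")
```

The toy data is built so that a test on the inputs alone, or on the targets alone, would see matching marginals. Only a statistic on the joint pairs can pick up the change, which is the situation the abstract describes as not assuming where the shift occurred.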
