LG CV ML

Jul 25, 2016

A Statistical Test for Joint Distributions Equivalence

arXiv:1607.07270v1

1.91 citations

Originality: Synthesis-oriented

AI Analysis

The paper provides a method for verifying dataset shift in machine learning. The contribution is incremental, since it builds on existing kernel two-sample tests.

The paper addresses the problem of deciding whether two joint distributions are statistically different using a distribution-free test. It extends kernel two-sample tests to joint distributions, which allows dataset shift in a learning framework to be verified without assumptions about the type of shift.

We provide a distribution-free test that can be used to determine whether any two joint distributions $p$ and $q$ are statistically different by inspection of a large enough set of samples. Following recent efforts from Long et al. [1], we rely on joint kernel distribution embedding to extend the kernel two-sample test of Gretton et al. [2] to the case of joint probability distributions. Our main result can be directly applied to verify if a dataset-shift has occurred between training and test distributions in a learning framework, without further assuming the shift has occurred only in the input, in the target or in the conditional distribution.
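The idea in the abstract can be illustrated with a minimal sketch. It is not the paper's exact procedure: it assumes a product of two Gaussian kernels as the joint kernel (one on inputs, one on targets), a biased MMD estimate as the statistic, and a permutation test in place of whatever threshold the paper derives. All function names, bandwidths, and the permutation count below are hypothetical choices made for illustration.

```python
import numpy as np

def rbf_gram(a, b, bandwidth):
    # Gaussian kernel matrix between the rows of a and the rows of b.
    sq = np.sum(a**2, axis=1)[:, None] + np.sum(b**2, axis=1)[None, :] - 2.0 * a @ b.T
    return np.exp(-np.maximum(sq, 0.0) / (2.0 * bandwidth**2))

def joint_gram(x, y, bw_x=1.0, bw_y=1.0):
    # Assumed joint kernel: product of an input kernel and a target kernel,
    # which corresponds to a tensor-product embedding of the pair (x, y).
    return rbf_gram(x, x, bw_x) * rbf_gram(y, y, bw_y)

def mmd2_biased(gram, idx_p, idx_q):
    # Biased estimate of the squared MMD from a precomputed Gram matrix.
    k_pp = gram[np.ix_(idx_p, idx_p)].mean()
    k_qq = gram[np.ix_(idx_q, idx_q)].mean()
    k_pq = gram[np.ix_(idx_p, idx_q)].mean()
    return k_pp + k_qq - 2.0 * k_pq

def joint_mmd_permutation_test(xp, yp, xq, yq, n_perm=200, seed=0):
    # Pool both samples, compute the statistic, then compare it with the
    # statistics obtained after randomly reassigning pairs to the two groups.
    rng = np.random.default_rng(seed)
    x = np.vstack([xp, xq])
    y = np.vstack([yp, yq])
    gram = joint_gram(x, y)
    n, total = len(xp), len(x)
    idx = np.arange(total)
    observed = mmd2_biased(gram, idx[:n], idx[n:])
    count = 0
    for _ in range(n_perm):
        perm = rng.permutation(total)
        if mmd2_biased(gram, perm[:n], perm[n:]) >= observed:
            count += 1
    p_value = (count + 1) / (n_perm + 1)
    return observed, p_value
```

Because the statistic is computed on pairs $(x, y)$ rather than on inputs alone, a sketch like this can in principle flag a shift that leaves the marginal of $x$ unchanged and alters only the relation between $x$ and $y$. For example, calling `joint_mmd_permutation_test` on samples where $y \approx x$ in one set and $y \approx -x$ in the other should yield a small p-value even though both input marginals are identical.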

