LG, IT, ML | May 5, 2020

Measuring the Discrepancy between Conditional Distributions: Methods, Properties and Applications

Shujian Yu, Ammar Shaker, Francesco Alesiani, Jose C. Principe

arXiv:2005.02196v27.22 citationsHas Code

Originality: Incremental advance

AI Analysis

This work addresses the problem of measuring differences between conditional distributions, a task relevant to machine learning practitioners. The method is an incremental advance that builds on existing correntropy and Bregman divergence techniques.

The authors tackled the problem of quantifying the discrepancy between two conditional distributions by proposing a new test statistic that avoids explicit high-dimensional distribution estimation and incorporates high-order statistics. They demonstrated its utility in multi-task learning, concept drift detection, and feature selection, with code provided for implementation.

We propose a simple yet powerful test statistic to quantify the discrepancy between two conditional distributions. The new statistic avoids the explicit estimation of the underlying distributions in high-dimensional space, and it operates on the cone of symmetric positive semidefinite (SPS) matrices using the Bregman matrix divergence. Moreover, it inherits the merits of the correntropy function to explicitly incorporate high-order statistics in the data. We present the properties of our new statistic and illustrate its connections to prior art. Finally, we show applications of our new statistic to three different machine learning problems, namely multi-task learning over graphs, concept drift detection, and information-theoretic feature selection, to demonstrate its utility and advantage. Code for our statistic is available at https://bit.ly/BregmanCorrentropy.
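To make the ingredients named in the abstract concrete, the following is a minimal sketch, not the authors' released code. It assumes a Gaussian-kernel centered correntropy matrix computed across variables, the LogDet form of the Bregman matrix divergence, and a "joint minus marginal" decomposition for the conditional discrepancy. The function names, the kernel width `sigma`, and the ridge term `eps` are all hypothetical choices made for illustration; the abstract does not specify them.

```python
import numpy as np


def centered_correntropy(data, sigma=1.0):
    """Empirical centered correntropy matrix of the columns of `data` (N x d).

    Entry (i, j) is mean_t k(x_i[t] - x_j[t]) minus the mean of
    k(x_i[s] - x_j[t]) over all pairs (s, t), with a Gaussian kernel k.
    """
    data = np.asarray(data, dtype=float)
    n, d = data.shape

    def gauss(u):
        return np.exp(-(u ** 2) / (2.0 * sigma ** 2))

    C = np.empty((d, d))
    for i in range(d):
        for j in range(d):
            paired = data[:, i] - data[:, j]                    # same-sample differences
            crossed = data[:, i][:, None] - data[:, j][None, :]  # all sample pairs
            C[i, j] = gauss(paired).mean() - gauss(crossed).mean()
    return C


def logdet_divergence(A, B):
    """LogDet Bregman divergence tr(A B^-1) - log det(A B^-1) - n for SPD A, B."""
    n = A.shape[0]
    M = np.linalg.solve(B, A)  # B^-1 A has the same trace and determinant as A B^-1
    _, logdet = np.linalg.slogdet(M)
    return np.trace(M) - logdet - n


def conditional_divergence(x1, y1, x2, y2, sigma=1.0, eps=1e-6):
    """Assumed decomposition: divergence of joint (x, y) matrices minus that of x alone."""
    def corr(z):
        C = centered_correntropy(z, sigma)
        return C + eps * np.eye(C.shape[0])  # ridge keeps the matrix strictly positive definite

    joint1, joint2 = np.column_stack([x1, y1]), np.column_stack([x2, y2])
    return (logdet_divergence(corr(joint1), corr(joint2))
            - logdet_divergence(corr(np.asarray(x1)), corr(np.asarray(x2))))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(200, 3))
    y_a = x @ np.array([1.0, -1.0, 0.5]) + 0.1 * rng.normal(size=200)
    y_b = x @ np.array([-1.0, 1.0, 0.5]) + 0.1 * rng.normal(size=200)
    print("same mapping:     ", conditional_divergence(x, y_a, x, y_a))
    print("different mapping:", conditional_divergence(x, y_a, x, y_b))
```

Because every quantity here is a small d x d matrix built from kernel averages, no density is estimated in the original data space, which matches the motivation stated in the abstract. The LogDet divergence is asymmetric, so a symmetrized variant (averaging both argument orders) may be preferable in practice.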
