LG MLOct 17, 2017

Robust importance-weighted cross-validation under sample selection bias

Wouter M. Kouw, Jesse H. Krijthe, Marco Loog

arXiv:1710.06514v32.02 citations

Originality Incremental advance

AI Analysis

This work addresses a specific issue in machine learning for practitioners dealing with biased data, but it is incremental as it builds on existing importance-weighting techniques.

The paper tackles the problem of sub-optimal hyperparameter estimates in cross-validation under sample selection bias when using importance-weighted risk estimators, and introduces a control variate method to increase robustness, reducing variance by up to 30% in experiments.

Cross-validation under sample selection bias can, in principle, be done by importance-weighting the empirical risk. However, the importance-weighted risk estimator produces sub-optimal hyperparameter estimates in problem settings where large weights arise with high probability. We study its sampling variance as a function of the training data distribution and introduce a control variate to increase its robustness to problematically large weights.

View on arXiv PDF

Similar