ST, CO, ML
Oct 19, 2020

Reweighting samples under covariate shift using a Wasserstein distance criterion

arXiv:2010.09267v2
1 citation
Originality: Incremental advance
AI Analysis

This work addresses covariate shift in statistical learning. Its sample reweighting method is an incremental advance, but it improves on existing approaches by dropping the assumption that one distribution is absolutely continuous with respect to the other.

The paper tackles the problem of reweighting samples under covariate shift by minimizing the Wasserstein distance between empirical distributions, leading to weights expressed via Nearest Neighbors. It provides consistency and asymptotic convergence rates without requiring absolute continuity assumptions, with applications in Uncertainty Quantification and generalization error bounds for Nearest Neighbor Regression.

Considering two random variables with different laws to which we only have access through finite size iid samples, we address how to reweight the first sample so that its empirical distribution converges towards the true law of the second sample as the size of both samples goes to infinity. We study an optimal reweighting that minimizes the Wasserstein distance between the empirical measures of the two samples, and leads to an expression of the weights in terms of Nearest Neighbors. The consistency and some asymptotic convergence rates in terms of expected Wasserstein distance are derived, and do not need the assumption of absolute continuity of one random variable with respect to the other. These results have some application in Uncertainty Quantification for decoupled estimation and in the bound of the generalization error for the Nearest Neighbor Regression under covariate shift.
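The Nearest Neighbor form of the weights can be illustrated with a short sketch. It assumes the second sample carries uniform mass 1/m and that the ground cost is the Euclidean distance raised to the power p. Under those assumptions the transport cost is minimized by sending each point of the second sample to its nearest point in the first sample, so the weight of a point in the first sample is the fraction of second-sample points that have it as nearest neighbor. The function name `nn_reweight` and the tie-breaking rule (lowest index wins) are illustrative choices, not taken from the paper.

```python
import numpy as np


def nn_reweight(x, y, p=1):
    """Weights on the points of x that minimize the p-Wasserstein distance
    between sum_i w_i * delta_{x_i} and the uniform empirical measure on y.

    Returns (weights, distance). Illustrative sketch: brute-force search,
    Euclidean ground cost, ties broken by lowest index.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    x = x.reshape(x.shape[0], -1)  # (n, d)
    y = y.reshape(y.shape[0], -1)  # (m, d)

    # Pairwise Euclidean distances between y_j and x_i, shape (m, n).
    dists = np.sqrt(((y[:, None, :] - x[None, :, :]) ** 2).sum(axis=-1))

    # Index of the nearest x_i for every y_j.
    nearest = dists.argmin(axis=1)

    # w_i = (number of y_j whose nearest neighbor is x_i) / m.
    weights = np.bincount(nearest, minlength=x.shape[0]) / y.shape[0]

    # Attained distance: each y_j moves its mass 1/m to its nearest neighbor.
    nn_dists = dists[np.arange(y.shape[0]), nearest]
    distance = (nn_dists ** p).mean() ** (1.0 / p)
    return weights, distance


x = np.array([[0.0], [1.0], [3.0]])         # first sample (to be reweighted)
y = np.array([[0.1], [0.9], [1.2], [2.9]])  # second sample (target)
w, dist = nn_reweight(x, y, p=1)
print(w, dist)
```

In this toy example the middle point of the first sample is the nearest neighbor of two of the four target points, so it receives weight 0.5 and the other two points receive 0.25 each. The brute-force distance matrix costs O(nm) memory; a k-d tree or similar index would be the natural replacement for larger samples.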

