Robust Importance Weighting for Covariate Shift
This work addresses covariate shift for machine learning practitioners, offering a robust estimator that combines the strengths of existing methods, though the contribution is incremental in nature.
The paper tackles the problem of covariate shift in learning by proposing a new estimator that integrates nonparametric regression residuals with kernel mean matching reweighting. The estimator either strictly improves on or matches the best-known rates for both methods, and it performs well in experiments.
In many learning problems, the training and testing data follow different distributions, and a particularly common instance of this is \textit{covariate shift}. To correct for sampling biases, most approaches, including the popular kernel mean matching (KMM), focus on estimating the importance weights between the two distributions. Reweighting-based methods, however, are exposed to high variance when the distributional discrepancy is large and the weights are poorly estimated. On the other hand, the alternative approach of using nonparametric regression (NR) incurs high bias when the training size is limited. In this paper, we propose and analyze a new estimator that systematically integrates the residuals of NR with KMM reweighting, based on a control-variate perspective. The proposed estimator can be shown to either strictly outperform or match the best-known existing rates for both KMM and NR, and thus is a robust combination of both estimators. Experiments show that the estimator works well in practice.
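To make the control-variate idea concrete, the following is a minimal sketch, not the paper's implementation. It assumes the target is the mean response under the test distribution, that a regression function g has already been fitted, and that importance weights (for example from KMM) are already available. The function name `control_variate_estimate` and all argument names are hypothetical, and any data splitting or weight constraints used in the actual method are omitted. The estimate is the average of g over the test covariates, corrected by the importance-weighted average of the training residuals y - g(x).

```python
from statistics import fmean


def control_variate_estimate(y_train, g_train, weights, g_test):
    """Sketch of a control-variate estimate of the test-distribution mean of Y.

    All names are illustrative assumptions, not the authors' API.
    y_train : observed training responses
    g_train : regression predictions g(x) at the training covariates
    weights : importance weights for the training points (e.g. from KMM)
    g_test  : regression predictions g(x) at the test covariates
    """
    if not (len(y_train) == len(g_train) == len(weights)):
        raise ValueError("training inputs must have equal length")
    if len(y_train) == 0 or len(g_test) == 0:
        raise ValueError("inputs must be non-empty")

    # Importance-weighted average of the regression residuals on training data.
    residual_term = fmean(
        [w * (y - g) for w, y, g in zip(weights, y_train, g_train)]
    )
    # Plain average of the regression predictions on the test covariates.
    regression_term = fmean(g_test)
    return residual_term + regression_term
```

Two limiting cases show how the sketch interpolates between the two baselines: if g fits the training responses exactly, the residual term vanishes and the result is the pure regression estimate; if g is identically zero, the result is the pure importance-weighted average of the training responses.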