Pseudo-Labeling for Unsupervised Domain Adaptation with Kernel GLMs
This work addresses domain adaptation for kernel GLMs and offers a principled method for settings with covariate shift, though the contribution is incremental because it builds on existing pseudo-labeling and model selection techniques.
The paper tackles unsupervised domain adaptation under covariate shift in kernel Generalized Linear Models. It proposes a pseudo-labeling framework that uses labeled source data and unlabeled target data to minimize prediction error in the target domain, and its experiments report consistent gains over source-only baselines.
We propose a principled framework for unsupervised domain adaptation under covariate shift in kernel Generalized Linear Models (GLMs), encompassing kernelized linear, logistic, and Poisson regression with ridge regularization. Our goal is to minimize prediction error in the target domain by leveraging labeled source data and unlabeled target data, despite differences in covariate distributions. We partition the labeled source data into two batches: one for training a family of candidate models, and the other for building an imputation model. This imputation model generates pseudo-labels for the target data, enabling robust model selection. We establish non-asymptotic excess-risk bounds that characterize adaptation performance through an "effective labeled sample size", explicitly accounting for the unknown covariate shift. Experiments on synthetic and real datasets demonstrate consistent performance gains over source-only baselines.
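The split-and-select procedure described above can be sketched for the squared-loss (kernel ridge regression) case as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the RBF kernel, the equal split of the source data, the candidate grid `lams`, the imputation penalty `impute_lam`, and the synthetic data are all hypothetical choices that the summary does not specify.

```python
import numpy as np

def rbf_kernel(A, B, gamma=1.0):
    # Pairwise RBF kernel between rows of A and rows of B (kernel choice is an assumption).
    sq = (A ** 2).sum(1)[:, None] + (B ** 2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fit_krr(X, y, lam, gamma=1.0):
    # Kernel ridge regression: solve (K + n * lam * I) alpha = y, return a predictor.
    n = len(X)
    K = rbf_kernel(X, X, gamma)
    alpha = np.linalg.solve(K + n * lam * np.eye(n), y)
    return lambda Z: rbf_kernel(Z, X, gamma) @ alpha

def select_by_pseudo_labels(Xs, ys, Xt, lams, impute_lam, gamma=1.0, seed=0):
    # Split the labeled source data into two batches (equal halves: an assumption).
    rng = np.random.default_rng(seed)
    perm = rng.permutation(len(Xs))
    half = len(Xs) // 2
    i1, i2 = perm[:half], perm[half:]
    # Batch 1 trains the family of candidate models, one per ridge penalty.
    candidates = [fit_krr(Xs[i1], ys[i1], lam, gamma) for lam in lams]
    # Batch 2 trains the imputation model that pseudo-labels the target covariates.
    imputer = fit_krr(Xs[i2], ys[i2], impute_lam, gamma)
    pseudo = imputer(Xt)
    # Pick the candidate with the smallest squared error against the pseudo-labels.
    scores = np.array([np.mean((f(Xt) - pseudo) ** 2) for f in candidates])
    best = int(np.argmin(scores))
    return candidates[best], lams[best], scores

# Synthetic covariate shift: same regression function, different covariate distributions.
rng = np.random.default_rng(1)
Xs = rng.normal(0.0, 1.0, size=(200, 1))   # labeled source covariates
ys = np.sin(2.0 * Xs[:, 0]) + 0.1 * rng.normal(size=200)
Xt = rng.normal(1.5, 0.5, size=(100, 1))   # unlabeled target covariates
lams = [1e-4, 1e-3, 1e-2, 1e-1, 1.0]       # hypothetical candidate grid

model, best_lam, scores = select_by_pseudo_labels(Xs, ys, Xt, lams, impute_lam=1e-3)
```

Because the selection score is computed on target covariates, the chosen penalty reflects fit in the region where the target distribution places its mass rather than where the source data are dense. Extending the sketch to logistic or Poisson regression would mean replacing the closed-form solve with an iterative fit and the squared error with the corresponding GLM loss.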