Regularized Learning for Domain Adaptation under Label Shifts
This work addresses domain adaptation when label distributions differ between source and target domains, offering machine learning practitioners incremental improvements over existing methods.
The paper tackles domain adaptation under label shifts by proposing RLLS, a method that estimates importance weights and trains a classifier on weighted source samples, improving classification accuracy on CIFAR-10 and MNIST datasets, particularly in low-sample and large-shift scenarios.
We propose Regularized Learning under Label shifts (RLLS), a principled and practical domain-adaptation algorithm to correct for shifts in the label distribution between a source and a target domain. We first estimate importance weights using labeled source data and unlabeled target data, and then train a classifier on the weighted source samples. We derive a generalization bound for the classifier on the target domain that is independent of the (ambient) data dimension and instead depends only on the complexity of the function class. To the best of our knowledge, this is the first generalization bound for the label-shift problem where the labels in the target domain are not available. Based on this bound, we propose a regularized estimator for the small-sample regime that accounts for the uncertainty in the estimated weights. Experiments on the CIFAR-10 and MNIST datasets show that RLLS improves classification accuracy, especially in the low-sample and large-shift regimes, compared to previous methods.
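The two-step procedure described above (estimate importance weights, then train on weighted source samples) can be illustrated with a minimal sketch. It is not the paper's exact estimator. It assumes a confusion-matrix-based estimate of the weights w(y) = q(y)/p(y), obtained from the predictions of some black-box classifier, with a plain ridge (squared L2) penalty that shrinks the weights toward 1 when samples are scarce. The function name `estimate_label_shift_weights` and the parameter `reg` are hypothetical and chosen for illustration only.

```python
import numpy as np

def estimate_label_shift_weights(y_src, pred_src, pred_tgt, n_classes, reg=0.0):
    """Estimate w[k] ~ q(y=k) / p(y=k) from a black-box classifier's predictions.

    Illustrative assumption: solve the ridge problem
        min_theta ||C @ theta - b||^2 + reg * ||theta||^2,   w = 1 + theta,
    so a larger `reg` pulls the weights toward 1 (no correction).
    """
    # C[i, j]: empirical source probability of (predicted = i, true label = j).
    C = np.zeros((n_classes, n_classes))
    np.add.at(C, (pred_src, y_src), 1.0)
    C /= len(y_src)

    # Marginals of the predicted label on the target and source samples.
    q_pred = np.bincount(pred_tgt, minlength=n_classes) / len(pred_tgt)
    p_pred = C.sum(axis=1)

    # Under label shift, C @ w = q_pred and C @ 1 = p_pred, so C @ theta = b.
    b = q_pred - p_pred
    theta = np.linalg.solve(C.T @ C + reg * np.eye(n_classes), C.T @ b)

    # Importance weights cannot be negative.
    return np.clip(1.0 + theta, 0.0, None)

# Toy usage: a perfect classifier, balanced source labels, target skewed to class 0.
y_src = np.array([0, 0, 1, 1])
pred_tgt = np.array([0, 0, 0, 1])
w = estimate_label_shift_weights(y_src, y_src, pred_tgt, n_classes=2)

# Per-sample weights for the second step: each source example gets the weight
# of its label, to be used as the sample weight in a weighted training loss.
sample_weights = w[y_src]
```

In this toy case the unregularized estimate recovers the true ratios (0.75/0.5 and 0.25/0.5), while a positive `reg` moves both weights toward 1, which mirrors the idea of hedging against uncertainty in the estimated weights when few samples are available.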