Adapting to Latent Subgroup Shifts via Concepts and Proxies
This work addresses domain adaptation in settings where the standard covariate shift and label shift assumptions fail. It offers a solution for applications with latent confounding, though the contribution appears incremental because it builds on existing identification frameworks.
The paper tackles unsupervised domain adaptation under latent subgroup shift, where the source and target domains differ because the distribution of a confounding latent subgroup changes. It shows that the optimal target predictor can be identified from concepts and proxies observed in the source domain together with unlabeled target data, and that the resulting method outperforms covariate shift and label shift adjustment.
We address the problem of unsupervised domain adaptation when the source domain differs from the target domain because of a shift in the distribution of a latent subgroup. When this subgroup confounds all observed data, neither covariate shift nor label shift assumptions apply. We show that the optimal target predictor can be non-parametrically identified with the help of concept and proxy variables available only in the source domain, and unlabeled data from the target. The identification results are constructive, immediately suggesting an algorithm for estimating the optimal predictor in the target. For continuous observations, when this algorithm becomes impractical, we propose a latent variable model specific to the data generation process at hand. We show how the approach degrades as the size of the shift changes, and verify that it outperforms both covariate and label shift adjustment.
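To make the role of the latent subgroup concrete, the following is a minimal discrete sketch of why the optimal predictor changes when only the subgroup distribution shifts. It is not the paper's estimator: it assumes the subgroup is observed and its target distribution is known, which is exactly what the concept and proxy variables are meant to avoid. The symbols U (subgroup), X (input), and Y (label), the function name `predictor`, and all probability tables are hypothetical choices made for illustration. The sketch relies on one assumption, that p(y | x, u) and p(x | u) are the same in both domains, so that the target predictor is the mixture of p(y | x, u) over u weighted by the target posterior of u given x.

```python
import numpy as np

# Hypothetical toy tables (not from the paper): 2 subgroups, 3 values of X.
# Both tables are assumed to be shared by the source and target domains.
p_x_given_u = np.array([[0.7, 0.2, 0.1],    # p(x | u=0)
                        [0.1, 0.3, 0.6]])   # p(x | u=1)
p_y1_given_xu = np.array([[0.9, 0.2],       # p(y=1 | x=0, u=0), p(y=1 | x=0, u=1)
                          [0.6, 0.4],       # same for x=1
                          [0.3, 0.8]])      # same for x=2

def predictor(prior_u):
    """Return p(y=1 | x) for each x under a given subgroup prior p(u)."""
    prior_u = np.asarray(prior_u, dtype=float)
    joint = prior_u[:, None] * p_x_given_u            # p(u, x), shape (U, X)
    post = joint / joint.sum(axis=0, keepdims=True)   # p(u | x) by Bayes' rule
    # Mix the invariant conditional p(y=1 | x, u) over the posterior p(u | x).
    return np.einsum("xu,ux->x", p_y1_given_xu, post)

source_pred = predictor([0.8, 0.2])   # subgroup prior in the source (assumed)
target_pred = predictor([0.3, 0.7])   # shifted subgroup prior in the target (assumed)
print("source p(y=1|x):", source_pred)
print("target p(y=1|x):", target_pred)
```

In this toy setting the two predictors differ for every x even though no conditional table changed, which illustrates why a model fitted on source data alone is not optimal in the target. Both p(x) and p(y) also change between the two priors here, so a correction that reweights only inputs or only labels would not be expected to recover the target predictor.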