Semi-Generative Modelling: Covariate-Shift Adaptation with Cause and Effect Features
This paper addresses domain adaptation for machine learning models under covariate shift. The contribution is incremental, as it builds on existing causal and semi-supervised methods.
The paper tackles covariate-shift adaptation by combining it with semi-supervised learning, using both cause and effect features of the target variable to build a semi-generative model; experiments on synthetic data show significant improvements in classification over supervised and importance-weighting baselines.
Current methods for covariate-shift adaptation use unlabelled data to compute importance weights or domain-invariant features, while the final model is trained on labelled data only. Here, we consider a particular case of covariate shift which also allows us to learn from unlabelled data, that is, to combine adaptation with semi-supervised learning. Using ideas from causality, we argue that this requires learning with both causes, $X_C$, and effects, $X_E$, of a target variable, $Y$, and show how this setting leads to what we call a semi-generative model, $P(Y, X_E \mid X_C, \theta)$. Our approach is robust to domain shifts in the distribution of causal features and leverages unlabelled data by learning a direct map from causes to effects. Experiments on synthetic data demonstrate significant improvements in classification over purely supervised and importance-weighting baselines.
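To make the semi-generative idea concrete, the sketch below assumes the model factorises as $P(Y, X_E \mid X_C, \theta) = P(Y \mid X_C)\, P(X_E \mid Y, X_C)$, so that an unlabelled pair $(x_C, x_E)$ contributes the marginal $\sum_y P(y \mid x_C)\, P(x_E \mid y, x_C)$. The abstract does not state this factorisation or any parametric form. The binary label, the logistic model for $Y$, the linear-Gaussian model for $X_E$, and all parameter names (`a`, `b`, `c`, `mu0`, `mu1`, `sigma`) are hypothetical choices made here for illustration only.

```python
import numpy as np

def log_sigmoid(z):
    # Numerically stable log(sigmoid(z)).
    return -np.logaddexp(0.0, -z)

def log_normal(x, mean, sigma):
    # Log-density of a univariate Gaussian.
    return -0.5 * np.log(2 * np.pi * sigma**2) - 0.5 * ((x - mean) / sigma) ** 2

def log_joint(x_c, x_e, theta):
    """Return an (n, 2) array of log P(y, x_e | x_c, theta) for y = 0, 1.

    Assumed (hypothetical) parametrisation:
      P(Y=1 | x_c)     = sigmoid(a * x_c + b)
      P(x_e | y, x_c)  = Normal(mu_y + c * x_c, sigma^2)
    """
    z = theta["a"] * x_c + theta["b"]
    log_p_y = np.stack([log_sigmoid(-z), log_sigmoid(z)], axis=1)
    means = np.stack(
        [theta["mu0"] + theta["c"] * x_c, theta["mu1"] + theta["c"] * x_c],
        axis=1,
    )
    log_p_xe = log_normal(x_e[:, None], means, theta["sigma"])
    return log_p_y + log_p_xe

def labelled_loglik(x_c, x_e, y, theta):
    # Labelled points use the joint term for the observed label (y is an int array).
    lj = log_joint(x_c, x_e, theta)
    return lj[np.arange(len(y)), y].sum()

def unlabelled_loglik(x_c, x_e, theta):
    # Unlabelled points marginalise the label out: log sum_y P(y, x_e | x_c).
    lj = log_joint(x_c, x_e, theta)
    return np.logaddexp(lj[:, 0], lj[:, 1]).sum()

def posterior_y1(x_c, x_e, theta):
    # P(Y=1 | x_c, x_e) by Bayes' rule within the conditional model.
    lj = log_joint(x_c, x_e, theta)
    return np.exp(lj[:, 1] - np.logaddexp(lj[:, 0], lj[:, 1]))
```

Under these assumptions, a semi-supervised fit would maximise `labelled_loglik(...) + unlabelled_loglik(...)` over `theta`. Because every term conditions on $X_C$, no model of the distribution of causal features is needed, which is one way to read the claimed robustness to shifts in that distribution.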