Attuned to Change: Causal Fine-Tuning under Latent-Confounded Shifts
This work addresses a core challenge in AI, improving model robustness under data shift, though the contribution is incremental: it builds on existing causal modeling and fine-tuning approaches.
The paper tackles the problem of adapting pre-trained foundation models to latent-confounded shifts, where spurious correlations in the fine-tuning data cause failures at deployment, and shows that its causal fine-tuning method outperforms black-box domain generalization baselines on semi-synthetic benchmarks.
Adapting to latent-confounded shifts remains a core challenge in modern AI. These shifts are propagated via latent variables that induce spurious, non-transportable correlations between inputs and labels. One practical failure mode arises when fine-tuning pre-trained foundation models on confounded data (e.g., where certain text tokens or image backgrounds spuriously correlate with the label), leaving models vulnerable at deployment. We frame causal fine-tuning as an identification problem and pose an explicit causal model that decomposes inputs into low-level spurious features and high-level causal representations. Under this family of models, we formalize the assumptions required for identification. Using pre-trained language models as a case study, we show how identifying and adjusting these components during causal fine-tuning enables automatic adaptation to latent-confounded shifts at test time. Experiments on semi-synthetic benchmarks derived from real-world problems demonstrate that our method outperforms black-box domain generalization baselines, illustrating the benefits of explicitly modeling causal structure.
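The failure mode described in the abstract can be illustrated with a small simulation. The sketch below is a toy construction and not the paper's model or method: every name and probability in it is an assumption chosen for illustration. A latent binary variable u influences both the label y and a spurious feature s, while a causal feature c influences y directly. The link between u and s is strong at training time and reversed at test time. A "naive" predictor that uses both c and s is compared with an "oracle" predictor that is simply told to use only c, standing in for a representation from which the spurious component has been removed.

```python
import random
from collections import Counter, defaultdict


def sample(n, p_s_matches_u, rng):
    """Draw n rows (c, s, y) from the toy model (all values hypothetical)."""
    rows = []
    for _ in range(n):
        c = rng.randint(0, 1)  # causal feature
        u = rng.randint(0, 1)  # latent confounder, never observed by the models
        # The label follows the confounder 70% of the time, else the causal feature.
        y = u if rng.random() < 0.7 else c
        # The spurious feature copies u with an environment-dependent probability.
        s = u if rng.random() < p_s_matches_u else 1 - u
        rows.append((c, s, y))
    return rows


def use_both(c, s):
    return (c, s)


def use_causal_only(c, s):
    return (c,)


def fit_majority(rows, featurize):
    """Lookup-table classifier: predict the majority label in each feature cell."""
    votes = defaultdict(Counter)
    for c, s, y in rows:
        votes[featurize(c, s)][y] += 1
    return {key: counts.most_common(1)[0][0] for key, counts in votes.items()}


def accuracy(table, rows, featurize):
    hits = sum(table[featurize(c, s)] == y for c, s, y in rows)
    return hits / len(rows)


rng = random.Random(0)
train = sample(5000, p_s_matches_u=0.95, rng=rng)  # s tracks u during fine-tuning
test = sample(5000, p_s_matches_u=0.05, rng=rng)   # the relation flips at deployment

naive = fit_majority(train, use_both)
oracle = fit_majority(train, use_causal_only)

naive_train = accuracy(naive, train, use_both)
naive_test = accuracy(naive, test, use_both)
oracle_train = accuracy(oracle, train, use_causal_only)
oracle_test = accuracy(oracle, test, use_causal_only)

print(f"naive  (c and s): train={naive_train:.3f}  test={naive_test:.3f}")
print(f"oracle (c only):  train={oracle_train:.3f}  test={oracle_test:.3f}")
```

Under these assumed probabilities, the naive predictor ends up following s, so its expected accuracy is about 0.815 in the training environment and about 0.185 after the shift. The c-only predictor has an expected accuracy of about 0.65 in both. The oracle is handed the causal feature for free, whereas the identification problem posed in the abstract is precisely how to recover such a decomposition from confounded data.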