Causal Discovery in Heterogeneous Environments Under the Sparse Mechanism Shift Hypothesis
This addresses the challenge of causal discovery in non-i.i.d. settings for machine learning practitioners, offering a nonparametric approach that leverages sparse changes, though it is incremental in building on existing causal frameworks.
The paper tackles the problem of learning causal structure from heterogeneous data with distribution shifts by proposing the Mechanism Shift Score (MSS), a score-based method that provably identifies the full causal graph under the sparse mechanism shift hypothesis, showing empirical improvements over other methods.
Machine learning approaches commonly rely on the assumption of independent and identically distributed (i.i.d.) data. In reality, however, this assumption is almost always violated due to distribution shifts between environments. Although valuable learning signals can be provided by heterogeneous data from changing distributions, it is also known that learning under arbitrary (adversarial) changes is impossible. Causality provides a useful framework for modeling distribution shifts, since causal models encode both observational and interventional distributions. In this work, we explore the sparse mechanism shift hypothesis, which posits that distribution shifts occur due to a small number of changing causal conditionals. Motivated by this idea, we apply it to learning causal structure from heterogeneous environments, where i.i.d. data only allows for learning an equivalence class of graphs without restrictive assumptions. We propose the Mechanism Shift Score (MSS), a score-based approach amenable to various empirical estimators, which provably identifies the entire causal structure with high probability if the sparse mechanism shift hypothesis holds. Empirically, we verify behavior predicted by the theory and compare multiple estimators and score functions to identify the best approaches in practice. Compared to other methods, we show how MSS bridges a gap by both being nonparametric as well as explicitly leveraging sparse changes.