Variance Minimization in the Wasserstein Space for Invariant Causal Prediction
This work addresses a computational bottleneck in invariant causal prediction (ICP), offering researchers and practitioners a more scalable method for identifying causal predictors. The contribution is an incremental improvement on existing ICP approaches rather than a new framework.
The paper tackles the computational inefficiency of invariant causal prediction (ICP), whose runtime scales exponentially with the number of candidate causal variables, by reformulating it as a series of nonparametric tests that scales linearly in the number of predictors. The authors introduce a novel loss function, the Wasserstein variance, prove that the method recovers the set of identifiable direct causes under mild assumptions, and report performance competitive with benchmark causal discovery algorithms.
Selecting powerful predictors for an outcome is a cornerstone task for machine learning. However, some types of questions can only be answered by identifying the predictors that causally affect the outcome. A recent approach to this causal inference problem leverages the invariance property of a causal mechanism across differing experimental environments (Peters et al., 2016; Heinze-Deml et al., 2018). This method, invariant causal prediction (ICP), has a substantial computational defect -- the runtime scales exponentially with the number of possible causal variables. In this work, we show that the approach taken in ICP may be reformulated as a series of nonparametric tests that scales linearly in the number of predictors. Each of these tests relies on the minimization of a novel loss function -- the Wasserstein variance -- that is derived from tools in optimal transport theory and is used to quantify distributional variability across environments. We prove under mild assumptions that our method is able to recover the set of identifiable direct causes, and we demonstrate in our experiments that it is competitive with other benchmark causal discovery algorithms.
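As a rough illustration of how a Wasserstein variance can quantify distributional variability across environments, the sketch below handles the one-dimensional case. The abstract does not give the paper's exact definition, so this assumes a common reading: the mean squared 2-Wasserstein distance from each environment's empirical distribution to their barycenter. In one dimension the barycenter's quantile function is the pointwise average of the environments' quantile functions, which makes the computation a few lines of NumPy. The function name `wasserstein_variance_1d` and the `n_quantiles` parameter are hypothetical and not taken from the paper.

```python
import numpy as np


def wasserstein_variance_1d(samples, n_quantiles=200):
    """Empirical 2-Wasserstein variance of several 1-D samples.

    `samples` is a list of 1-D arrays, one per environment.
    Assumed definition (not confirmed by the abstract): the mean squared
    W2 distance from each environment's distribution to the barycenter.
    """
    # Midpoints of a uniform grid on (0, 1), which avoids the extreme tails.
    levels = (np.arange(n_quantiles) + 0.5) / n_quantiles
    # Row e holds the empirical quantile function of environment e.
    quantiles = np.stack(
        [np.quantile(np.asarray(s, dtype=float), levels) for s in samples]
    )
    # In 1-D, the W2 barycenter's quantile function is the pointwise mean.
    barycenter = quantiles.mean(axis=0)
    # Squared W2 distance = integral of the squared quantile difference,
    # approximated here by the mean over the grid of levels.
    sq_dists = ((quantiles - barycenter) ** 2).mean(axis=1)
    return float(sq_dists.mean())


rng = np.random.default_rng(0)
base = rng.normal(size=500)

# Identical distributions in every environment: no variability.
invariant = wasserstein_variance_1d([base, base.copy(), base.copy()])

# The same sample shifted by 2 in a second environment: each environment
# sits at distance 1 from the barycenter, so the variance is 1.
shifted = wasserstein_variance_1d([base, base + 2.0])

print(invariant, shifted)
```

Under this reading, a value near zero indicates that the quantity being compared (for example, residuals from a candidate predictor set) has roughly the same distribution in every environment, while larger values indicate variation across environments. How the paper turns the minimized loss into a nonparametric test is not specified in the abstract and is not sketched here.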