Combining Interventional and Observational Data Using Causal Reductions
This work addresses causal inference under unobserved confounding and is aimed at researchers and practitioners. Its contribution is incremental: it combines existing techniques, such as normalizing flows, with causal reductions.
The paper tackles unobserved confounding in causal effect estimation by proposing a causal reduction method that replaces multiple latent confounders with a single one. This enables estimation from combined observational and interventional data without assuming that all confounders are observed. In experiments on synthetic data, adding observational samples reduces the number of interventional samples needed in several cases, without sacrificing accuracy.
Unobserved confounding is one of the main challenges when estimating causal effects. We propose a causal reduction method that, given a causal model, replaces an arbitrary number of possibly high-dimensional latent confounders with a single latent confounder that takes values in the same space as the treatment variable, without changing the observational and interventional distributions the causal model entails. This allows us to estimate the causal effect in a principled way from combined data without relying on the common but often unrealistic assumption that all confounders have been observed. We apply our causal reduction in three different settings. In the first setting, we assume the treatment and outcome to be discrete. The causal reduction then implies bounds between the observational and interventional distributions that can be exploited for estimation purposes. In certain cases with highly unbalanced observational samples, the accuracy of the causal effect estimate can be improved by incorporating observational data. Second, for continuous variables and assuming a linear-Gaussian model, we derive equality constraints for the parameters of the observational and interventional distributions. Third, for the general continuous setting (possibly nonlinear and non-Gaussian), we parameterize the reduced causal model using normalizing flows, a flexible class of easily invertible nonlinear transformations. We perform a series of experiments on synthetic data and find that in several cases the number of interventional samples can be reduced when adding observational training samples without sacrificing accuracy.
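To make the discrete setting concrete, the sketch below checks one well-known pair of bounds relating the two distributions: P(X=x, Y=y) <= P(Y=y | do(X=x)) <= P(X=x, Y=y) + 1 - P(X=x). The abstract does not spell out which bounds the causal reduction yields, so treating these as the relevant ones is an assumption. The toy model (a binary latent confounder Z and the probability tables) and the helper name `natural_bounds` are hypothetical and chosen only for illustration.

```python
import numpy as np


def natural_bounds(p_xy):
    """Bounds on P(Y=y | do(X=x)) from the observational joint p_xy[x, y].

    Assumed (not taken from the source): the standard bounds
    P(x, y) <= P(y | do(x)) <= P(x, y) + 1 - P(x).
    """
    p_xy = np.asarray(p_xy, dtype=float)
    p_x = p_xy.sum(axis=1, keepdims=True)  # marginal P(X=x)
    lower = p_xy
    upper = p_xy + 1.0 - p_x
    return lower, upper


# Hypothetical toy model: binary latent confounder Z, binary X and Y.
p_z = np.array([0.7, 0.3])                  # P(Z=z)
p_x_given_z = np.array([[0.9, 0.1],         # P(X=x | Z=0)
                        [0.2, 0.8]])        # P(X=x | Z=1)
p_y1_given_xz = np.array([[0.1, 0.5],       # P(Y=1 | X=0, Z=z)
                          [0.4, 0.9]])      # P(Y=1 | X=1, Z=z)
# Full conditional table indexed as [x, z, y].
p_y_given_xz = np.stack([1.0 - p_y1_given_xz, p_y1_given_xz], axis=-1)

# Observational joint: P(x, y) = sum_z P(z) P(x | z) P(y | x, z).
p_xy = np.einsum("z,zx,xzy->xy", p_z, p_x_given_z, p_y_given_xz)

# Interventional distribution: P(y | do(x)) = sum_z P(z) P(y | x, z).
p_y_do_x = np.einsum("z,xzy->xy", p_z, p_y_given_xz)

lower, upper = natural_bounds(p_xy)
for x in range(2):
    for y in range(2):
        print(f"x={x}, y={y}: {lower[x, y]:.3f} <= "
              f"{p_y_do_x[x, y]:.3f} <= {upper[x, y]:.3f}")
```

The interval width for a given treatment value x is 1 - P(X=x), so the bounds are tightest for treatment values that dominate the observational data. This may be related to the abstract's remark about highly unbalanced observational samples, although that reading is an interpretation rather than a claim made in the source.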