Estimating Interventional Distributions with Uncertain Causal Graphs through Meta-Learning
This addresses a key challenge in causal inference for scientific domains like biology and social sciences by providing a scalable solution to manage structural uncertainty, though it is incremental as it builds on existing Bayesian and meta-learning approaches.
The paper tackles the problem of estimating interventional distributions when causal graphs are uncertain, proposing MACE-TNP, a meta-learning model that predicts Bayesian model-averaged interventional posteriors without expensive computations, and empirically shows it outperforms strong Bayesian baselines.
In scientific domains -- from biology to the social sciences -- many questions boil down to \textit{What effect will we observe if we intervene on a particular variable?} If the causal relationships (e.g.~a causal graph) are known, it is possible to estimate the intervention distributions. In the absence of this domain knowledge, the causal structure must be discovered from the available observational data. However, observational data are often compatible with multiple causal graphs, making methods that commit to a single structure prone to overconfidence. A principled way to manage this structural uncertainty is via Bayesian inference, which averages over a posterior distribution on possible causal structures and functional mechanisms. Unfortunately, the number of causal structures grows super-exponentially with the number of nodes in the graph, making computations intractable. We propose to circumvent these challenges by using meta-learning to create an end-to-end model: the Model-Averaged Causal Estimation Transformer Neural Process (MACE-TNP). The model is trained to predict the Bayesian model-averaged interventional posterior distribution, and its end-to-end nature bypasses the need for expensive calculations. Empirically, we demonstrate that MACE-TNP outperforms strong Bayesian baselines. Our work establishes meta-learning as a flexible and scalable paradigm for approximating complex Bayesian causal inference, that can be scaled to increasingly challenging settings in the future.