Estimating Causal Effects from Learned Causal Networks
This work addresses a computational bottleneck in causal inference that affects both researchers and practitioners, though the contribution appears incremental because it builds on existing probabilistic graphical model algorithms.
The paper answers causal-effect queries by learning a causal Bayesian network directly from observational data rather than generating estimands, and shows that this approach can be more effective for larger models, where estimand expressions become computationally difficult to evaluate.
The standard approach to answering an identifiable causal-effect query (e.g., $P(Y|do(X))$) given a causal diagram and observational data is to first generate an estimand, that is, a probabilistic expression over the observable variables, and then evaluate it using the observational data. In this paper, we propose an alternative paradigm for answering causal-effect queries over discrete observable variables. Instead of generating an estimand, we propose to learn the causal Bayesian network and its confounding latent variables directly from the observational data. Efficient probabilistic graphical model (PGM) algorithms can then be applied to the learned model to answer queries. Perhaps surprisingly, we show that this \emph{model completion} learning approach can be more effective than estimand approaches, particularly for larger models in which the estimand expressions become computationally difficult. We illustrate our method's potential using a benchmark collection of Bayesian networks and synthetically generated causal models.
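To make the contrast between the two paradigms concrete, the following is a minimal sketch on a small front-door model (U -> X, U -> Y, X -> M, M -> Y, with U latent). The model, the variable names, and the randomly drawn parameters are illustrative assumptions and do not come from the paper; in particular, the learning step is skipped and the tables stand in for a model that has already been learned. The sketch answers $P(Y|do(X=x))$ once by querying the model directly (truncated factorization) and once by evaluating the front-door estimand over the observable variables only.

```python
import random
from itertools import product

random.seed(0)
V = (0, 1)  # every variable is binary in this toy example


def random_dist(k=2):
    """Return a random, strictly positive distribution over k values."""
    w = [random.random() + 0.1 for _ in range(k)]
    total = sum(w)
    return [v / total for v in w]


# Hypothetical parameters standing in for a learned causal Bayesian network.
p_u = random_dist()                                      # P(u), U is latent
p_x_u = {u: random_dist() for u in V}                    # p_x_u[u][x] = P(x | u)
p_m_x = {x: random_dist() for x in V}                    # p_m_x[x][m] = P(m | x)
p_y_mu = {(m, u): random_dist() for m in V for u in V}   # P(y | m, u)


def query_on_model(x):
    """P(Y | do(X=x)) via truncated factorization on the full model:
    drop P(x | u), clamp X to x, and sum out U and M."""
    return [
        sum(p_u[u] * p_m_x[x][m] * p_y_mu[(m, u)][y] for u in V for m in V)
        for y in V
    ]


# Observational joint P(x, m, y), obtained by summing out the latent U.
p_obs = {
    (x, m, y): sum(
        p_u[u] * p_x_u[u][x] * p_m_x[x][m] * p_y_mu[(m, u)][y] for u in V
    )
    for x, m, y in product(V, repeat=3)
}


def query_by_estimand(x):
    """P(Y | do(X=x)) via the front-door estimand
    sum_m P(m | x) * sum_x' P(y | m, x') P(x'), using observables only."""
    p_x = {a: sum(p_obs[(a, m, y)] for m in V for y in V) for a in V}
    p_xm = {(a, m): sum(p_obs[(a, m, y)] for y in V) for a in V for m in V}
    return [
        sum(
            (p_xm[(x, m)] / p_x[x])
            * sum(p_obs[(a, m, y)] / p_xm[(a, m)] * p_x[a] for a in V)
            for m in V
        )
        for y in V
    ]


for x in V:
    print(x, query_on_model(x), query_by_estimand(x))
```

On this tiny model the two routes agree up to floating-point error, as expected for an identifiable query. The sketch only illustrates what each paradigm computes; it says nothing about their relative cost or accuracy on finite data, which is the subject of the paper's comparison.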