Towards Scalable Bayesian Learning of Causal DAGs
This work addresses scalable causal discovery for researchers and practitioners in statistics and machine learning, offering incremental improvements in the efficiency of posterior sampling over DAGs and in the performance of causal effect estimation.
The paper tackles Bayesian inference of causal directed acyclic graphs (DAGs) and causal effects from observed data. It presents algorithmic improvements that enable efficient sampling with larger candidate parent sets, together with a novel Bayesian method for causal effect estimation that outperforms previous approaches in experiments.
We give methods for Bayesian inference of directed acyclic graphs (DAGs) and the induced causal effects from passively observed complete data. Our methods build on a recent Markov chain Monte Carlo scheme for learning Bayesian networks, which enables efficient approximate sampling from the graph posterior, provided that each node is assigned a small number $K$ of candidate parents. We present algorithmic techniques that significantly reduce the space and time requirements, making substantially larger values of $K$ feasible. Furthermore, we investigate the problem of selecting the candidate parents per node so as to maximize the covered posterior mass. Finally, we combine our sampling method with a novel Bayesian approach for estimating causal effects in linear Gaussian DAG models. Numerical experiments demonstrate the performance of our methods in detecting ancestor-descendant relations; in causal effect estimation, our Bayesian method is shown to outperform previous approaches.
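To make the two downstream quantities concrete, the following is a minimal sketch of how ancestor-descendant probabilities and total causal effects could be computed once DAGs are available. It is not the paper's algorithm. It assumes the DAG samples are already given as edge lists and are treated as equally weighted draws from the graph posterior, and that edge weights of the linear model are fixed rather than given a posterior. The function names `reachability`, `ancestor_posterior`, and `total_effects` are hypothetical and chosen only for illustration.

```python
def reachability(n, edges):
    """reach[i][j] is True if the DAG has a directed path from i to j."""
    reach = [[False] * n for _ in range(n)]
    for i, j in edges:
        reach[i][j] = True
    # Warshall's transitive closure over intermediate nodes k.
    for k in range(n):
        for i in range(n):
            if reach[i][k]:
                for j in range(n):
                    if reach[k][j]:
                        reach[i][j] = True
    return reach


def ancestor_posterior(n, sampled_dags):
    """Fraction of sampled DAGs in which i is an ancestor of j.

    Assumption: sampled_dags are equally weighted posterior draws,
    each given as a list of directed edges (i, j).
    """
    counts = [[0] * n for _ in range(n)]
    for edges in sampled_dags:
        reach = reachability(n, edges)
        for i in range(n):
            for j in range(n):
                counts[i][j] += reach[i][j]
    m = len(sampled_dags)
    return [[c / m for c in row] for row in counts]


def total_effects(n, weights):
    """Total causal effects in a linear model x_j = sum_i b_ij * x_i + noise.

    weights maps an edge (i, j) to its coefficient b_ij; the graph must be
    acyclic. The effect of i on j is the sum, over all directed paths from
    i to j, of the product of the edge coefficients along the path.
    """
    parents = {j: [] for j in range(n)}
    for (i, j), b in weights.items():
        parents[j].append((i, b))
    memo = {}

    def effect(i, j):
        if i == j:
            return 1.0
        if (i, j) not in memo:
            # Recurse through the parents of j; terminates because the graph is acyclic.
            memo[(i, j)] = sum(effect(i, p) * b for p, b in parents[j])
        return memo[(i, j)]

    return [[0.0 if i == j else effect(i, j) for j in range(n)] for i in range(n)]


# Toy usage: three variables, edges 0 -> 1, 1 -> 2, and 0 -> 2.
effects = total_effects(3, {(0, 1): 2.0, (1, 2): 3.0, (0, 2): 0.5})
print(effects[0][2])  # 2.0 * 3.0 + 0.5 = 6.5
```

In a fully Bayesian treatment the effects would themselves be averaged over the posterior of graphs and parameters; the sketch only shows the per-graph computation that such an average would be built on.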