Amortized Inference for Causal Structure Learning
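This work targets two obstacles in causal discovery: the cost of searching over candidate graphs, and the difficulty of designing score functions or independence tests that capture prior knowledge. It replaces both with an inference model trained on simulated data, and is aimed at researchers in fields such as genomics.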
The authors address the costly combinatorial search in causal structure learning by training a variational inference model to predict the causal structure directly from observational or interventional data, bypassing hand-engineered score functions and search over graphs. The model generalizes to larger problem instances than those seen during training and remains robust under distribution shift, outperforming existing algorithms on synthetic and semisynthetic gene expression data.
Inferring causal structure poses a combinatorial search problem that typically involves evaluating structures with a score or independence test. The resulting search is costly, and designing suitable scores or tests that capture prior knowledge is difficult. In this work, we propose to amortize causal structure learning. Rather than searching over structures, we train a variational inference model to directly predict the causal structure from observational or interventional data. This allows our inference model to acquire domain-specific inductive biases for causal discovery solely from data generated by a simulator, bypassing both the hand-engineering of suitable score functions and the search over graphs. The architecture of our inference model emulates permutation invariances that are crucial for statistical efficiency in structure learning, which facilitates generalization to significantly larger problem instances than seen during training. On synthetic data and semisynthetic gene expression data, our models exhibit robust generalization capabilities when subject to substantial distribution shifts and significantly outperform existing algorithms, especially in the challenging genomics domain. Our code and models are publicly available at: https://github.com/larslorch/avici.
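The permutation symmetries mentioned in the abstract can be illustrated with a small sketch. The NumPy code below is a hypothetical toy predictor, not the architecture from the paper or the API of the released code: the function name `edge_probabilities`, the parameters `a`, `b`, `w`, and the pairwise feature construction are all assumptions made for illustration. It maps a dataset of n samples over d variables to a d-by-d matrix of edge probabilities such that reordering the samples leaves the output unchanged, while relabeling the variables relabels the rows and columns of the output in the same way.

```python
import numpy as np


def edge_probabilities(x, params):
    """Toy predictor mapping a dataset x of shape [n, d] to a [d, d] matrix.

    Entry (i, j) is read as the predicted probability of an edge i -> j.
    Hypothetical illustration only; not the architecture from the paper.
    """
    a, b, w = params  # each of shape [k]; untrained, randomly drawn below
    src = x[:, :, None, None] * a  # [n, d, 1, k], features of candidate parent i
    tgt = x[:, None, :, None] * b  # [n, 1, d, k], features of candidate child j
    # Averaging over the sample axis makes the output independent of row order.
    pair_feats = np.tanh(src + tgt).mean(axis=0)  # [d, d, k]
    logits = pair_feats @ w  # [d, d]
    return 1.0 / (1.0 + np.exp(-logits))


rng = np.random.default_rng(0)
n, d, k = 50, 4, 8
x = rng.normal(size=(n, d))
params = (rng.normal(size=k), rng.normal(size=k), rng.normal(size=k))
probs = edge_probabilities(x, params)

# Shuffling the samples should leave the prediction unchanged (invariance).
row_perm = rng.permutation(n)
probs_rows = edge_probabilities(x[row_perm], params)

# Relabeling the variables should permute rows and columns alike (equivariance).
var_perm = rng.permutation(d)
probs_vars = edge_probabilities(x[:, var_perm], params)
```

Under these assumptions, `probs_rows` should match `probs` up to floating-point error, and `probs_vars` should match `probs[np.ix_(var_perm, var_perm)]`. Because no operation in the sketch depends on a fixed n or d, the same parameters can be applied to datasets with more variables than any used to fit them, which is the kind of property the abstract credits for generalization to larger problem instances.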