CausalBench: A Large-scale Benchmark for Network Inference from Single-cell Perturbation Data
CausalBench gives researchers in causal inference and biology a principled benchmark for tracking progress in network inference from real-world data, though the contribution is incremental because it focuses on evaluation rather than on new methods.
The authors address the challenge of evaluating causal inference methods in real-world settings by introducing CausalBench, a benchmark suite built on real-world single-cell perturbation data. They find that current methods scale poorly and that methods using interventional information do not outperform methods that use only observational data, contrary to expectation.
Causal inference is a vital aspect of multiple scientific disciplines and is routinely applied to high-impact applications such as medicine. However, evaluating the performance of causal inference methods in real-world environments is challenging due to the need for observations under both interventional and control conditions. Traditional evaluations conducted on synthetic datasets do not reflect the performance in real-world systems. To address this, we introduce CausalBench, a benchmark suite for evaluating network inference methods on real-world interventional data from large-scale single-cell perturbation experiments. CausalBench incorporates biologically-motivated performance metrics, including new distribution-based interventional metrics. A systematic evaluation of state-of-the-art causal inference methods using our CausalBench suite highlights how poor scalability of current methods limits performance. Moreover, methods that use interventional information do not outperform those that only use observational data, contrary to what is observed on synthetic benchmarks. Thus, CausalBench opens new avenues in causal network inference research and provides a principled and reliable way to track progress in leveraging real-world interventional data.
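To make the idea of a distribution-based interventional metric concrete, here is a minimal sketch of what one could look like. It is an illustration under stated assumptions, not the metric defined by CausalBench: the abstract does not specify the distance used, so the choice of the 1-Wasserstein distance, the function names `wasserstein_1d` and `mean_edge_shift`, and the toy data layout are all hypothetical. The intuition is that if a predicted edge A -> B is real, perturbing A should shift the expression distribution of B relative to control cells.

```python
from statistics import mean


def wasserstein_1d(xs, ys):
    """1-Wasserstein distance between two equal-size empirical samples.

    For equal sample sizes this reduces to the mean absolute difference
    between the sorted values. (Assumption: equal sizes, to keep the
    sketch free of external dependencies.)
    """
    if len(xs) != len(ys):
        raise ValueError("this sketch assumes equal sample sizes")
    return mean(abs(a - b) for a, b in zip(sorted(xs), sorted(ys)))


def mean_edge_shift(edges, control, perturbed):
    """Average distributional shift over predicted edges (source, target).

    control[target]            -> expression of target in unperturbed cells
    perturbed[source][target]  -> expression of target in cells where
                                  source was perturbed
    Both layouts are hypothetical and chosen only for illustration.
    """
    shifts = [wasserstein_1d(control[t], perturbed[s][t]) for s, t in edges]
    return mean(shifts) if shifts else 0.0


# Toy data: perturbing gene A shifts gene B but leaves gene C unchanged.
control = {"B": [1.0, 2.0, 3.0, 4.0], "C": [1.0, 2.0, 3.0, 4.0]}
perturbed = {"A": {"B": [3.0, 4.0, 5.0, 6.0], "C": [1.0, 2.0, 3.0, 4.0]}}

score_supported = mean_edge_shift([("A", "B")], control, perturbed)
score_unsupported = mean_edge_shift([("A", "C")], control, perturbed)
```

Under this sketch, a predicted network whose edges are backed by large interventional shifts scores higher than one whose edges show no shift, and no ground-truth graph is needed to compute the score.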