LG AI ML Nov 30, 2020

RealCause: Realistic Causal Inference Benchmarking

Brady Neal, Chin-Wei Huang, Sunand Raghupathi

arXiv:2011.15007v2 17.452 citations Has Code

Originality: Incremental advance

AI Analysis

This work addresses a practical problem for researchers and practitioners: how to evaluate and select causal effect estimators. It does so by providing a more realistic benchmarking environment than previous synthetic-data approaches.

This paper introduces RealCause, a benchmark for causal effect estimators that uses flexible generative models to produce data with known ground-truth causal effects while remaining representative of real data. The authors use this benchmark to evaluate over 1500 causal estimators and report evidence that it is rational to choose hyperparameters for these estimators using predictive metrics.

There are many different causal effect estimators in causal inference. However, it is unclear how to choose between these estimators because there is no ground-truth for causal effects. A commonly used option is to simulate synthetic data, where the ground-truth is known. However, the best causal estimators on synthetic data are unlikely to be the best causal estimators on real data. An ideal benchmark for causal estimators would both (a) yield ground-truth values of the causal effects and (b) be representative of real data. Using flexible generative models, we provide a benchmark that both yields ground-truth and is realistic. Using this benchmark, we evaluate over 1500 different causal estimators and provide evidence that it is rational to choose hyperparameters for causal estimators using predictive metrics.
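The benchmarking idea in the abstract can be illustrated with a toy sketch. It is not the RealCause implementation: it assumes the generative model is fit to an observed dataset, and it swaps the paper's flexible generative models for simple per-arm linear-Gaussian models. All names here (`make_observed_data`, `fit_generative_model`, `sample_benchmark`, and the two estimators) are hypothetical, and the "observed" data is itself simulated as a stand-in for a real dataset.

```python
import random
import statistics


def make_observed_data(n, rng):
    """Stand-in for a real observational dataset (hypothetical, simulated)."""
    rows = []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        # Treatment is more likely when x > 0, so x confounds t and y.
        t = 1 if rng.random() < (0.8 if x > 0 else 0.2) else 0
        rows.append((x, t, 2.0 * t + 3.0 * x + rng.gauss(0.0, 1.0)))
    return rows


def fit_line(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, residual_std)."""
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx
    a = my - b * mx
    resid = [y - (a + b * x) for x, y in zip(xs, ys)]
    return a, b, statistics.pstdev(resid)


def fit_outcome_models(rows):
    """Fit one linear outcome model per treatment arm."""
    models = {}
    for arm in (0, 1):
        xs = [x for x, t, _ in rows if t == arm]
        ys = [y for _, t, y in rows if t == arm]
        models[arm] = fit_line(xs, ys)
    return models


def fit_generative_model(rows):
    """Fit p(x) (empirical), p(t | sign of x), and p(y | t, x)."""
    xs = [x for x, _, _ in rows]
    p_treat = {}
    for positive in (False, True):
        p_treat[positive] = statistics.fmean(
            [t for x, t, _ in rows if (x > 0) == positive]
        )
    return xs, p_treat, fit_outcome_models(rows)


def sample_benchmark(model, n, rng):
    """Sample a dataset from the fitted model; return it with its true ATE."""
    xs, p_treat, outcome = model
    (a0, b0, _), (a1, b1, _) = outcome[0], outcome[1]
    rows, effects = [], []
    for _ in range(n):
        x = rng.choice(xs)
        t = 1 if rng.random() < p_treat[x > 0] else 0
        a, b, s = outcome[t]
        rows.append((x, t, a + b * x + rng.gauss(0.0, s)))
        # Both potential-outcome means are known because we own the model.
        effects.append((a1 + b1 * x) - (a0 + b0 * x))
    return rows, statistics.fmean(effects)


def naive_ate(rows):
    """Difference in mean outcomes, ignoring confounding."""
    treated = [y for _, t, y in rows if t == 1]
    control = [y for _, t, y in rows if t == 0]
    return statistics.fmean(treated) - statistics.fmean(control)


def adjusted_ate(rows):
    """Regression adjustment: average the per-arm predicted difference."""
    m = fit_outcome_models(rows)
    (a0, b0, _), (a1, b1, _) = m[0], m[1]
    return statistics.fmean([(a1 + b1 * x) - (a0 + b0 * x) for x, _, _ in rows])


rng = random.Random(0)
observed = make_observed_data(4000, rng)
model = fit_generative_model(observed)
benchmark, true_ate = sample_benchmark(model, 4000, rng)
naive_error = abs(naive_ate(benchmark) - true_ate)
adjusted_error = abs(adjusted_ate(benchmark) - true_ate)
print(f"true ATE={true_ate:.2f}  naive error={naive_error:.2f}  "
      f"adjusted error={adjusted_error:.2f}")
```

Because the sampled data comes from a model whose potential outcomes are known, each estimator can be scored against an exact target. How realistic such a benchmark is depends on how well the fitted model matches the real data, which is the part this toy version does not attempt to address.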
