ML LG · Aug 31, 2022

An evaluation framework for comparing causal inference models

arXiv:2209.00115v17.97 citations · h-index: 8

Originality: Synthesis-oriented

AI Analysis

This work addresses robust benchmarking for researchers in causal inference, but its contribution is incremental because it builds on existing evaluation metrics.

The authors tackled the challenge of evaluating causal inference models by proposing a framework that uses statistical evidence, performance profiles, and tests to reduce the influence of outliers in benchmarking. They applied this methodology to compare several state-of-the-art models, though no specific numerical results are provided.

Estimation of causal effects is the core objective of many scientific disciplines. However, it remains a challenging task, especially when the effects are estimated from observational data. Recently, several promising machine learning models have been proposed for causal effect estimation. The evaluation of these models has been based on the mean values of the error of the Average Treatment Effect (ATE) as well as of the Precision in Estimation of Heterogeneous Effect (PEHE). In this paper, we propose to complement the evaluation of causal inference models using concrete statistical evidence, including the performance profiles of Dolan and Moré, as well as non-parametric and post-hoc statistical tests. The main motivation behind this approach is to eliminate the influence of a small number of instances or simulations on the benchmarking process, which in some cases dominate the results. We use the proposed evaluation methodology to compare several state-of-the-art causal effect estimation models.
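To make the idea of a performance profile concrete, the following is a minimal sketch of the Dolan and Moré construction, not the authors' implementation. For each dataset, every model's error is divided by the best error on that dataset, and the profile at a threshold tau is the fraction of datasets on which a model's ratio is at most tau. The model names, the error values, and the function names `performance_ratios` and `performance_profile` are hypothetical and chosen only for illustration; errors are assumed to be strictly positive with lower being better.

```python
def performance_ratios(errors):
    """Return per-dataset performance ratios for each model.

    errors maps a model name to a list of per-dataset errors
    (e.g. PEHE values); lower is better and all values must be > 0.
    """
    models = list(errors)
    n_datasets = len(errors[models[0]])
    # Best (smallest) error achieved by any model on each dataset.
    best = [min(errors[m][p] for m in models) for p in range(n_datasets)]
    return {m: [errors[m][p] / best[p] for p in range(n_datasets)] for m in models}


def performance_profile(ratios, tau):
    """Fraction of datasets on which each model is within a factor tau of the best."""
    return {m: sum(r <= tau for r in rs) / len(rs) for m, rs in ratios.items()}


# Hypothetical errors on four datasets: model_a is best on three of them
# but has one extreme outlier that dominates its mean error.
errors = {
    "model_a": [1.0, 1.0, 1.0, 50.0],
    "model_b": [2.0, 2.0, 2.0, 2.0],
}

mean_error = {m: sum(v) / len(v) for m, v in errors.items()}
ratios = performance_ratios(errors)

print(mean_error)                        # model_a: 13.25, model_b: 2.0
print(performance_profile(ratios, 1.0))  # model_a: 0.75, model_b: 0.25
print(performance_profile(ratios, 2.0))  # model_a: 0.75, model_b: 1.0
```

In this made-up example the mean error favors model_b solely because of one dataset, while the profile at tau = 1 shows that model_a attains the best error on three of the four datasets. This is the kind of outlier influence the abstract describes.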
