LG · Sep 27, 2022

Falsification before Extrapolation in Causal Effect Estimation

Zeshan Hussain, Michael Oberst, Ming-Chieh Shih, David Sontag

arXiv:2209.13708v311.111 citations · h-index: 52 · Has Code

Originality: Incremental advance

AI Analysis

The paper addresses the challenge of estimating causal effects for policy-making in populations not covered by RCTs, and offers a method to make such estimates more reliable. The advance is incremental: it builds on existing meta-analysis and causal inference techniques.

The paper tackles the problem of estimating causal effects for broader populations using observational data when RCTs are limited, by proposing a meta-algorithm that rejects biased observational estimates and provides conservative confidence intervals with coverage guarantees. It demonstrates favorable performance compared to standard meta-analysis techniques on semi-synthetic and real-world datasets.

Randomized Controlled Trials (RCTs) represent a gold standard when developing policy guidelines. However, RCTs are often narrow, and lack data on broader populations of interest. Causal effects in these populations are often estimated using observational datasets, which may suffer from unobserved confounding and selection bias. Given a set of observational estimates (e.g. from multiple studies), we propose a meta-algorithm that attempts to reject observational estimates that are biased. We do so using validation effects, causal effects that can be inferred from both RCT and observational data. After rejecting estimators that do not pass this test, we generate conservative confidence intervals on the extrapolated causal effects for subgroups not observed in the RCT. Under the assumption that at least one observational estimator is asymptotically normal and consistent for both the validation and extrapolated effects, we provide guarantees on the coverage probability of the intervals output by our algorithm. To facilitate hypothesis testing in settings where causal effect transportation across datasets is necessary, we give conditions under which a doubly-robust estimator of group average treatment effects is asymptotically normal, even when flexible machine learning methods are used for estimation of nuisance parameters. We illustrate the properties of our approach on semi-synthetic and real-world datasets, and show that it compares favorably to standard meta-analysis techniques.
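The falsify-then-extrapolate idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a single validation effect, independent RCT and observational samples, a plain two-sided z-test as the falsification test, and a conservative interval formed as the hull (minimum lower bound, maximum upper bound) of the accepted estimators' normal-approximation intervals. The names `ObsEstimate`, `passes_falsification`, and `conservative_interval`, and the reuse of one `alpha` for both steps, are hypothetical choices made for illustration; the paper's coverage guarantee applies to its own construction, not necessarily to this simplification.

```python
from dataclasses import dataclass
from statistics import NormalDist


@dataclass
class ObsEstimate:
    """Estimates from one observational study (hypothetical container)."""
    name: str
    val_effect: float  # estimate of the validation effect (also identifiable from the RCT)
    val_se: float      # its standard error
    ext_effect: float  # estimate of the extrapolated effect (subgroup not in the RCT)
    ext_se: float      # its standard error


def passes_falsification(obs, rct_effect, rct_se, alpha=0.05):
    """Two-sided z-test of H0: observational and RCT validation effects agree.

    Assumes the two estimates are independent, so their variances add.
    """
    z = (obs.val_effect - rct_effect) / (obs.val_se ** 2 + rct_se ** 2) ** 0.5
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_value >= alpha  # True means "not rejected"


def conservative_interval(estimates, rct_effect, rct_se, alpha=0.05):
    """Drop falsified estimators, then return the hull of the remaining intervals."""
    accepted = [e for e in estimates
                if passes_falsification(e, rct_effect, rct_se, alpha)]
    if not accepted:
        return None, accepted  # every estimator was rejected; report no interval
    q = NormalDist().inv_cdf(1 - alpha / 2)
    lower = min(e.ext_effect - q * e.ext_se for e in accepted)
    upper = max(e.ext_effect + q * e.ext_se for e in accepted)
    return (lower, upper), accepted


# Illustrative, made-up numbers: study "B" disagrees strongly with the RCT.
estimates = [
    ObsEstimate("A", val_effect=1.0, val_se=0.1, ext_effect=2.0, ext_se=0.2),
    ObsEstimate("B", val_effect=3.0, val_se=0.1, ext_effect=5.0, ext_se=0.2),
    ObsEstimate("C", val_effect=1.1, val_se=0.1, ext_effect=2.5, ext_se=0.2),
]
interval, accepted = conservative_interval(estimates, rct_effect=1.0, rct_se=0.1)
print([e.name for e in accepted], interval)
```

Taking the hull rather than pooling the accepted estimates is what makes the interval conservative in this sketch: if at least one accepted estimator is unbiased, its own interval is contained in the reported one, at the cost of a wider interval when accepted estimators disagree.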
