ML LG ME | May 17, 2018

The Blessings of Multiple Causes

arXiv:1805.06826v3 | 316 citations

Originality: Incremental advance

AI Analysis

This paper addresses a fundamental challenge in causal inference for scientific studies with multiple causes. The advance is incremental, but it offers a checkable alternative to the standard untestable assumptions.

The paper tackles causal inference from observational data when not all confounders are observed. It proposes the deconfounder, an algorithm that uses unsupervised learning to infer a latent variable as a substitute for the unobserved confounders. The method requires weaker assumptions than classical approaches and is evaluated on semi-simulated and real datasets.

Causal inference from observational data often assumes "ignorability," that all confounders are observed. This assumption is standard yet untestable. However, many scientific studies involve multiple causes, different variables whose effects are simultaneously of interest. We propose the deconfounder, an algorithm that combines unsupervised machine learning and predictive model checking to perform causal inference in multiple-cause settings. The deconfounder infers a latent variable as a substitute for unobserved confounders and then uses that substitute to perform causal inference. We develop theory for the deconfounder, and show that it requires weaker assumptions than classical causal inference. We analyze its performance in three types of studies: semi-simulated data around smoking and lung cancer, semi-simulated data around genome-wide association studies, and a real dataset about actors and movie revenue. The deconfounder provides a checkable approach to estimating closer-to-truth causal effects.
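
The two-stage idea in the abstract (infer a substitute confounder from the causes, then adjust for it) can be illustrated with a toy simulation. This is only a sketch under stated assumptions, not the authors' implementation: the data-generating process, the function names (`simulate`, `first_factor`, `ols`), and the use of a first principal component in place of a probabilistic factor model are all choices made here for illustration, and the predictive model check is omitted. Because a principal-component score is a linear combination of the causes, the sketch adjusts one cause at a time rather than regressing on all causes and the substitute together.

```python
import random

def simulate(n=2000, m=20, gamma=2.0, seed=0):
    """Toy data (an assumption): one hidden confounder z drives all causes and y."""
    rng = random.Random(seed)
    causes, outcome = [], []
    for _ in range(n):
        z = rng.gauss(0, 1)
        a = [z + rng.gauss(0, 1) for _ in range(m)]
        # Only cause 0 has a true effect (1.0); z confounds it with strength gamma.
        y = 1.0 * a[0] + gamma * z + rng.gauss(0, 1)
        causes.append(a)
        outcome.append(y)
    return causes, outcome

def first_factor(causes, iters=100):
    """Substitute confounder: scores on the first principal component,
    found by power iteration on the sample covariance of the causes."""
    n, m = len(causes), len(causes[0])
    means = [sum(r[j] for r in causes) / n for j in range(m)]
    x = [[r[j] - means[j] for j in range(m)] for r in causes]
    cov = [[sum(r[i] * r[j] for r in x) / n for j in range(m)] for i in range(m)]
    v = [1.0] * m
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(m)) for i in range(m)]
        norm = sum(t * t for t in w) ** 0.5
        v = [t / norm for t in w]
    return [sum(r[j] * v[j] for j in range(m)) for r in x]

def ols(columns, y):
    """Least-squares coefficients (intercept first) via the normal equations."""
    rows = [[1.0] + [col[i] for col in columns] for i in range(len(y))]
    k = len(rows[0])
    # Augmented matrix [X'X | X'y], reduced by Gauss-Jordan elimination.
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           + [sum(r[i] * t for r, t in zip(rows, y))] for i in range(k)]
    for p in range(k):
        pivot = aug[p][p]
        aug[p] = [t / pivot for t in aug[p]]
        for q in range(k):
            if q != p:
                f = aug[q][p]
                aug[q] = [tq - f * tp for tq, tp in zip(aug[q], aug[p])]
    return [row[k] for row in aug]

causes, y = simulate()
a0 = [row[0] for row in causes]
z_hat = first_factor(causes)           # stage 1: infer the substitute confounder
naive = ols([a0], y)[1]                # ignores confounding
adjusted = ols([a0, z_hat], y)[1]      # stage 2: adjust for the substitute
print(f"naive: {naive:.2f}, adjusted: {adjusted:.2f}, truth: 1.00")
```

In this toy setup the naive coefficient absorbs the confounder's influence, while the adjusted coefficient should land much closer to the simulated effect of 1.0. The comparison depends on the simulated structure, in which a single latent variable drives every cause, and says nothing about the paper's reported results.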
