Long Story Short: Omitted Variable Bias in Causal Machine Learning
This work addresses omitted variable bias in causal machine learning and gives empirical researchers interpretable tools for sensitivity analysis. Its contribution is incremental: it extends existing omitted variable bias theory to a broader class of models.
The authors develop a general theory of omitted variable bias for a range of causal parameters in machine-learned models. They show that simple plausibility judgments about the maximum explanatory power of omitted variables are enough to bound the bias, and they provide flexible, efficient inference methods for those bounds. They demonstrate the approach with two empirical examples.
We develop a general theory of omitted variable bias for a wide range of common causal parameters, including (but not limited to) averages of potential outcomes, average treatment effects, average causal derivatives, and policy effects from covariate shifts. Our theory applies to nonparametric models, while naturally allowing for (semi-)parametric restrictions (such as partial linearity) when such assumptions are made. We show how simple plausibility judgments on the maximum explanatory power of omitted variables are sufficient to bound the magnitude of the bias, thus facilitating sensitivity analysis in otherwise complex, nonlinear models. Finally, we provide flexible and efficient statistical inference methods for the bounds, which can leverage modern machine learning algorithms for estimation. These results allow empirical researchers to perform sensitivity analyses in a flexible class of machine-learned causal models using simple, interpretable tools. We demonstrate the utility of our approach with two empirical examples.
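To make the idea of bounding bias through the explanatory power of an omitted variable concrete, the sketch below works through the classical linear special case only. It is not the general nonparametric method described above. It assumes a simulated data-generating process (the coefficients, sample size, and the names `residualize`, `partial_r2`, and `ovb_bound` are all hypothetical choices made for illustration) and uses the well-known linear result that the absolute bias equals sqrt(R2_y * R2_d / (1 - R2_d)) times sd(Y residualized on D and X) divided by sd(D residualized on X), where R2_y is the partial R^2 of the omitted variable with the outcome and R2_d is its partial R^2 with the treatment.

```python
import numpy as np


def residualize(target, regressors):
    """Return OLS residuals of `target` on `regressors` plus an intercept."""
    design = np.column_stack([np.ones(len(target)), regressors])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return target - design @ coef


def partial_r2(a, b):
    """Squared correlation between two residual vectors (a partial R^2)."""
    return np.corrcoef(a, b)[0, 1] ** 2


def ovb_bound(y_res_dx, d_res_x, r2_y, r2_d):
    """Linear-case absolute bias implied by the two partial R^2 values."""
    scale = np.std(y_res_dx) / np.std(d_res_x)
    return np.sqrt(r2_y * r2_d / (1.0 - r2_d)) * scale


# Simulated data (an assumption for illustration; true effect of d on y is 1.0).
rng = np.random.default_rng(0)
n = 5_000
x = rng.normal(size=n)                                  # observed covariate
z = rng.normal(size=n)                                  # omitted variable
d = 0.5 * x + 0.4 * z + rng.normal(size=n)              # treatment
y = 1.0 * d + 0.3 * x + 0.6 * z + rng.normal(size=n)    # outcome

# Short regression (omits z), via partialling out x from d.
d_res_x = residualize(d, x)
tau_short = (d_res_x @ y) / (d_res_x @ d_res_x)

# Long regression (includes z), via partialling out x and z from d.
d_res_xz = residualize(d, np.column_stack([x, z]))
tau_long = (d_res_xz @ y) / (d_res_xz @ d_res_xz)

# Partial R^2 of z with the treatment (given x) and the outcome (given d, x).
dx = np.column_stack([d, x])
y_res_dx = residualize(y, dx)
r2_d = partial_r2(d_res_x, residualize(z, x))
r2_y = partial_r2(y_res_dx, residualize(z, dx))

# With the true partial R^2 values, the formula reproduces the actual bias.
bias_formula = ovb_bound(y_res_dx, d_res_x, r2_y, r2_d)
print(abs(tau_short - tau_long), bias_formula)

# In practice z is unobserved, so one plugs in hypothesized upper limits.
print(ovb_bound(y_res_dx, d_res_x, r2_y=0.10, r2_d=0.05))
```

Note that `y_res_dx` and `d_res_x` depend only on observed data, so the last call is the part an analyst could run without seeing the omitted variable: the two R^2 arguments are the plausibility judgments, and the returned value is the bias they would imply in this linear setting.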