Program Evaluation and Causal Inference with High-Dimensional Data
This paper addresses the problem of reliable causal inference with many control variables, a concern for researchers and practitioners in econometrics and data science. It represents a substantive methodological advance rather than an incremental step.
The paper tackles causal inference and program evaluation in high-dimensional settings by providing efficient estimators and uniformly valid confidence bands for a range of treatment effects, including local average and local quantile treatment effects. It demonstrates the methods with an application estimating the effect of 401(k) eligibility and participation on accumulated assets.
In this paper, we provide efficient estimators and honest confidence bands for a variety of treatment effects including local average (LATE) and local quantile treatment effects (LQTE) in data-rich environments. We can handle very many control variables, endogenous receipt of treatment, heterogeneous treatment effects, and function-valued outcomes. Our framework covers the special case of exogenous receipt of treatment, either conditional on controls or unconditionally as in randomized controlled trials. In the latter case, our approach produces efficient estimators and honest bands for (functional) average treatment effects (ATE) and quantile treatment effects (QTE). To make informative inference possible, we assume that key reduced form predictive relationships are approximately sparse. This assumption allows the use of regularization and selection methods to estimate those relations, and we provide methods for post-regularization and post-selection inference that are uniformly valid (honest) across a wide range of models. We show that a key ingredient enabling honest inference is the use of orthogonal or doubly robust moment conditions in estimating certain reduced form functional parameters. We illustrate the use of the proposed methods with an application to estimating the effect of 401(k) eligibility and participation on accumulated assets.
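To make the idea of a doubly robust moment condition concrete, the following is a minimal sketch of the standard augmented inverse-probability-weighting (AIPW) score for the ATE under exogenous treatment. It is an illustration under stated assumptions, not the paper's full procedure: the function name `aipw_ate` and its argument names are hypothetical, and the nuisance predictions (outcome regressions `g1`, `g0` and propensity scores `m`) are taken as given inputs, whereas in the setting described above they would be estimated by regularization or selection methods under approximate sparsity.

```python
import math


def aipw_ate(y, d, g1, g0, m):
    """Doubly robust (AIPW) estimate of the ATE with a plug-in standard error.

    All names here are illustrative assumptions, not the paper's notation.
    y  : observed outcomes
    d  : binary treatment indicators (0 or 1)
    g1 : predicted outcome under treatment, given controls
    g0 : predicted outcome under control, given controls
    m  : predicted probability of treatment (propensity), strictly in (0, 1)
    """
    n = len(y)
    scores = []
    for yi, di, g1i, g0i, mi in zip(y, d, g1, g0, m):
        # Regression contrast plus inverse-probability-weighted residual corrections.
        psi = (g1i - g0i
               + di * (yi - g1i) / mi
               - (1 - di) * (yi - g0i) / (1 - mi))
        scores.append(psi)
    ate = sum(scores) / n
    # Variance of the score across observations, then standard error of its mean.
    var = sum((s - ate) ** 2 for s in scores) / n
    return ate, math.sqrt(var / n)


# Toy data, made up purely for illustration.
y = [3.0, 1.0, 4.0, 2.0]
d = [1, 0, 1, 0]
m = [0.5, 0.5, 0.5, 0.5]

# Deliberately uninformative outcome model: with the propensity scores as given,
# the correction terms alone determine the estimate.
ate, se = aipw_ate(y, d, g1=[0.0] * 4, g0=[0.0] * 4, m=m)
print(ate, se)
```

The score combines an outcome-regression contrast with weighted residuals, so the estimate remains centered correctly if either the outcome model or the propensity model is right. The toy call uses a deliberately uninformative outcome model to show the correction terms doing the work. This insensitivity to small errors in the nuisance estimates is the sense in which such moment conditions are called orthogonal.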