Convolutional neural networks for valid and efficient causal inference
This work addresses causal inference challenges in epidemiology and social sciences by improving estimation accuracy with structured data, though it is incremental as it adapts CNNs to an existing semiparametric framework.
The authors tackle the problem of estimating causal effects with time-structured covariates by using convolutional neural networks (CNNs) to fit the nuisance models in semiparametric estimation. The resulting augmented inverse probability weighting estimator provides efficient and uniformly valid inference, which is illustrated in a Monte Carlo study and in an application to Swedish population data on early retirement and hospitalization.
Convolutional neural networks (CNNs) have been successful in machine learning applications. Their success relies on their ability to capture space-invariant local features. We consider the use of CNNs to fit nuisance models in semiparametric estimation of the average causal effect of a treatment. In this setting, nuisance models are functions of pre-treatment covariates that need to be controlled for. In an application where we want to estimate the effect of early retirement on a health outcome, we propose to use CNNs to control for time-structured covariates. Thus, CNNs are used when fitting the nuisance models that explain the treatment and the outcome. These fits are then combined into an augmented inverse probability weighting estimator, yielding efficient and uniformly valid inference. Theoretically, we contribute by providing rates of convergence for CNNs equipped with the rectified linear unit activation function and by comparing them to an existing result for feedforward neural networks. We also show when those rates guarantee uniformly valid inference. A Monte Carlo study evaluates the performance of the proposed estimator and compares it with other strategies. Finally, we report results from a study of the effect of early retirement on hospitalization using data covering the whole Swedish population.
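To make the combination step concrete, the following is a minimal sketch of an augmented inverse probability weighting (AIPW) estimate of the average causal effect, given fitted nuisance values. It is an illustration under stated assumptions, not the authors' implementation: the function names `aipw_ate` and `simulate`, the simulated data, and the true effect of 2.0 are all hypothetical. The true propensity score and outcome regressions stand in for the CNN fits, and any sample splitting is omitted.

```python
import math
import random


def aipw_ate(y, t, e_hat, m1_hat, m0_hat):
    """Return the AIPW estimate of the average causal effect and its standard error.

    y: observed outcomes; t: binary treatment indicators (0 or 1);
    e_hat: fitted propensity scores P(T=1 | X);
    m1_hat, m0_hat: fitted outcome regressions E[Y | T=1, X] and E[Y | T=0, X].
    """
    n = len(y)
    # Per-unit contribution: outcome-model contrast plus
    # inverse-probability-weighted residual corrections.
    phi = [
        m1 - m0 + ti * (yi - m1) / e - (1 - ti) * (yi - m0) / (1 - e)
        for yi, ti, e, m1, m0 in zip(y, t, e_hat, m1_hat, m0_hat)
    ]
    est = sum(phi) / n
    # Standard error from the sample variance of the contributions.
    var = sum((p - est) ** 2 for p in phi) / (n - 1)
    return est, math.sqrt(var / n)


def simulate(n, seed=0):
    """Hypothetical data: one confounder x, true average effect 2.0."""
    rng = random.Random(seed)
    y, t, e, m1, m0 = [], [], [], [], []
    for _ in range(n):
        x = rng.gauss(0.0, 1.0)
        e_i = 1.0 / (1.0 + math.exp(-0.5 * x))  # true propensity score
        t_i = 1 if rng.random() < e_i else 0
        m0_i, m1_i = x, x + 2.0                 # true outcome regressions
        y_i = (m1_i if t_i else m0_i) + rng.gauss(0.0, 1.0)
        y.append(y_i)
        t.append(t_i)
        e.append(e_i)
        m1.append(m1_i)
        m0.append(m0_i)
    return y, t, e, m1, m0


# Assumption: the true nuisance functions replace the fitted ones.
y, t, e_hat, m1_hat, m0_hat = simulate(5000)
ate, se = aipw_ate(y, t, e_hat, m1_hat, m0_hat)
lo, hi = ate - 1.96 * se, ate + 1.96 * se  # normal-approximation 95% interval
print(f"AIPW estimate: {ate:.3f}, 95% CI: [{lo:.3f}, {hi:.3f}]")
```

In practice, `e_hat`, `m1_hat`, and `m0_hat` would come from fitted models (here, CNNs applied to the time-structured covariates). The estimator itself is agnostic about how those fits are produced, provided they converge fast enough.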