Sufficient Dimension Reduction for Average Causal Effect Estimation
This work addresses the challenge that high-dimensional covariate sets pose for causal inference, offering researchers and practitioners a method to improve estimation reliability. The contribution is incremental, since it builds on existing dimension reduction and matching techniques.
The paper tackles the problem of unreliable causal effect estimation with many covariates by proving that a lower-dimensional representation can capture all necessary adjustment information, and demonstrates the effectiveness of a kernel-based reduction algorithm with nearest neighbor matching on semi-synthetic and real-world datasets.
A large number of covariates can degrade the quality of causal effect estimation, since confounding adjustment becomes unreliable when the number of covariates is large relative to the number of samples available. The propensity score is a common way to deal with a large covariate set, but the accuracy of propensity score estimation (normally done by logistic regression) is also challenged by a large number of covariates. In this paper, we prove that a large covariate set can be reduced to a lower-dimensional representation that captures the complete information for adjustment in causal effect estimation. This theoretical result enables effective data-driven algorithms for causal effect estimation. We develop an algorithm that employs a supervised kernel dimension reduction method to search for a lower-dimensional representation of the original covariates, and then uses nearest neighbor matching in the reduced covariate space to impute the counterfactual outcomes, avoiding the problems caused by a large covariate set. The proposed algorithm is evaluated on two semi-synthetic and three real-world datasets, and the results demonstrate its effectiveness.
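The two-stage idea described above (reduce the covariates, then match in the reduced space) can be illustrated with a minimal sketch. This is not the paper's implementation: the supervised kernel dimension reduction step is replaced by a placeholder linear projection with a hand-picked matrix `B`, and the function names `project` and `ate_by_matching`, the 1-nearest-neighbor rule with replacement, and the toy data are all assumptions made for illustration.

```python
import numpy as np

def project(X, B):
    """Placeholder for the reduction step: linear projection Z = X @ B.
    In the paper, the reduced representation is learned by a supervised
    kernel dimension reduction method; here B is simply supplied (assumption)."""
    return np.asarray(X, dtype=float) @ np.asarray(B, dtype=float)

def ate_by_matching(Z, t, y):
    """Estimate the average causal effect by 1-nearest-neighbor matching
    (with replacement) in the reduced covariate space Z."""
    Z = np.asarray(Z, dtype=float)
    t = np.asarray(t).astype(bool)
    y = np.asarray(y, dtype=float)
    treated = np.where(t)[0]
    control = np.where(~t)[0]

    # Pairwise squared Euclidean distances in the reduced space.
    d = ((Z[:, None, :] - Z[None, :, :]) ** 2).sum(axis=2)

    # Start from observed outcomes, then impute the missing counterfactuals.
    y1 = y.copy()  # potential outcome under treatment
    y0 = y.copy()  # potential outcome under control

    # Treated units: impute y0 from the nearest control unit.
    nearest_control = control[np.argmin(d[np.ix_(treated, control)], axis=1)]
    y0[treated] = y[nearest_control]

    # Control units: impute y1 from the nearest treated unit.
    nearest_treated = treated[np.argmin(d[np.ix_(control, treated)], axis=1)]
    y1[control] = y[nearest_treated]

    return float(np.mean(y1 - y0))

# Toy data: three covariates, of which only the first matters (assumption).
X = np.array([[0.0,  5.0, -1.0],
              [0.1, -3.0,  2.0],
              [1.0,  4.0,  0.0],
              [1.1, -2.0,  7.0]])
B = np.array([[1.0], [0.0], [0.0]])  # hypothetical projection onto covariate 1
t = np.array([1, 0, 1, 0])
y = np.array([3.0, 1.0, 5.0, 3.0])

estimate = ate_by_matching(project(X, B), t, y)
print(estimate)  # 2.0
```

In this toy example the second and third covariates are irrelevant, so matching on the one-dimensional projection pairs each unit with its natural counterpart, whereas matching on all three covariates would be driven by the irrelevant ones.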