Estimating Conditional Average Treatment Effects via Sufficient Representation Learning
The work addresses CATE estimation, a core problem in causal inference with applications in fields such as medicine and economics, and aims to improve estimation accuracy. The contribution appears incremental, since it builds on existing representation learning methods.
The paper addresses the estimation of conditional average treatment effects (CATE) from high-dimensional data. It proposes CrossNet, a neural network approach that learns a sufficient representation of the features and uses data from both the treatment and control groups when fitting each group's regression function. The authors report that it outperforms competing methods in simulations and empirical tests.
Estimating conditional average treatment effects (CATE) is a central task in causal inference, with applications across many fields. CATE estimation typically requires the unconfoundedness assumption to ensure the identifiability of the regression problems. For high-dimensional data, many variable selection methods and neural network approaches based on representation learning have been proposed. However, these methods provide no way to verify whether the subset of variables retained after dimensionality reduction, or the learned representation, still satisfies the unconfoundedness assumption during estimation, which can lead to poor estimates of the treatment effects. Additionally, these methods typically use data from only the treatment group or only the control group when estimating the regression function for that group. This paper proposes a novel neural network approach named \textbf{CrossNet} that learns a sufficient representation of the features, on which we then base the CATE estimate. Here, "cross" indicates that, in estimating each group's regression function, we use data from that group and also cross-utilize data from the other group. Numerical simulations and empirical results demonstrate that our method outperforms competing approaches.
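For reference, the quantities named in the abstract can be written in standard potential-outcomes notation. The symbols below ($X$ for the features, $T \in \{0,1\}$ for the treatment indicator, $Y(0), Y(1)$ for the potential outcomes, $Y$ for the observed outcome, and $R$ for a representation map) are assumptions chosen for illustration and may differ from the notation and exact definitions used in the paper itself.

\[
\tau(x) = \mathbb{E}\left[\, Y(1) - Y(0) \mid X = x \,\right]
\]

Under unconfoundedness, $(Y(0), Y(1)) \perp\!\!\!\perp T \mid X$, together with overlap, $0 < \mathbb{P}(T = 1 \mid X = x) < 1$, the CATE is identified by the difference of two regression functions:

\[
\tau(x) = \mu_1(x) - \mu_0(x), \qquad \mu_t(x) = \mathbb{E}\left[\, Y \mid X = x,\; T = t \,\right], \quad t \in \{0, 1\}.
\]

A representation $R(X)$ retains this identification property if unconfoundedness still holds after conditioning on it:

\[
(Y(0), Y(1)) \perp\!\!\!\perp T \mid R(X)
\quad \Longrightarrow \quad
\mathbb{E}\left[\, Y(t) \mid R(X) = r \,\right] = \mathbb{E}\left[\, Y \mid R(X) = r,\; T = t \,\right].
\]

This is one common way to formalize the condition that, according to the abstract, existing variable selection and representation learning methods cannot verify. The paper's notion of a sufficient representation may impose further requirements beyond this sketch.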