Robust and Agnostic Learning of Conditional Distributional Treatment Effects
This work addresses the need for more comprehensive causal inference in fields such as economics and medicine by providing a method for assessing treatment effects on outcome distributions rather than on averages alone. The contribution is incremental: it builds on existing DTE concepts, adding robustness and model-agnosticism.
The authors address the problem of estimating conditional distributional treatment effects (CDTEs) rather than only conditional averages, which can overlook risks and tail events. They develop a robust, model-agnostic method that constructs a pseudo-outcome and regresses it on covariates using any regression learner. The method learns CDTEs at rates that depend on the complexity of the regression class and supports inference on linear projections of CDTEs.
The conditional average treatment effect (CATE) is the best measure of individual causal effects given baseline covariates. However, the CATE only captures the (conditional) average, and can overlook risks and tail events, which are important to treatment choice. In aggregate analyses, this is usually addressed by measuring the distributional treatment effect (DTE), such as differences in quantiles or tail expectations between treatment groups. Hypothetically, one can similarly fit conditional quantile regressions in each treatment group and take their difference, but this would not be robust to misspecification or provide agnostic best-in-class predictions. We provide a new robust and model-agnostic methodology for learning the conditional DTE (CDTE) for a class of problems that includes conditional quantile treatment effects, conditional super-quantile treatment effects, and conditional treatment effects on coherent risk measures given by $f$-divergences. Our method is based on constructing a special pseudo-outcome and regressing it on covariates using any regression learner. Our method is model-agnostic in that it can provide the best projection of the CDTE onto the regression model class. Our method is robust in that even if we learn the nuisance functions needed to construct the pseudo-outcome nonparametrically at very slow rates, we can still learn CDTEs at rates that depend on the class complexity and even conduct inferences on linear projections of CDTEs. We investigate the behavior of our proposal in simulations, as well as in a case study of 401(k) eligibility effects on wealth.
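To make the pseudo-outcome regression described above concrete, the following is a minimal sketch for the conditional quantile treatment effect at the median. It is not the authors' implementation. It assumes a standard influence-function-style correction, $\psi = \hat q_1(X) - \hat q_0(X) + \frac{A}{\hat e(X)}\frac{\tau - 1\{Y \le \hat q_1(X)\}}{\hat f_1(\hat q_1(X) \mid X)} - \frac{1-A}{1-\hat e(X)}\frac{\tau - 1\{Y \le \hat q_0(X)\}}{\hat f_0(\hat q_0(X) \mid X)}$, and the paper's exact construction may differ. The simulated data, the deliberately biased nuisance estimate `q1_hat`, and the helper names `cqte_pseudo_outcome`, `normal_pdf`, and `fit_linear` are all illustrative assumptions. Ordinary least squares stands in for "any regression learner".

```python
import numpy as np

def cqte_pseudo_outcome(y, a, e, q0, q1, f0, f1, tau):
    """Assumed one-step corrected pseudo-outcome for a conditional quantile effect.

    e: propensity estimates; q0, q1: conditional quantile estimates per arm;
    f0, f1: conditional density estimates evaluated at q0, q1.
    """
    corr1 = a / e * (tau - (y <= q1)) / f1
    corr0 = (1 - a) / (1 - e) * (tau - (y <= q0)) / f0
    return (q1 - q0) + corr1 - corr0

def normal_pdf(z):
    """Standard normal density, used only because the simulated noise is N(0, 1)."""
    return np.exp(-0.5 * z**2) / np.sqrt(2 * np.pi)

def fit_linear(x, target):
    """Least-squares projection of target onto {intercept, x}."""
    design = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(design, target, rcond=None)
    return coef

# Hypothetical simulation: Y = A * X + N(0, 1), so the true median effect is X.
rng = np.random.default_rng(0)
n = 50_000
tau = 0.5
x = rng.uniform(0, 1, n)
a = rng.binomial(1, 0.5, n)
y = a * x + rng.normal(size=n)

# Nuisance "estimates" (assumptions): the propensity is known, the control
# median is exact, and the treated median is deliberately biased by 0.4 * x.
e_hat = np.full(n, 0.5)
q0_hat = np.zeros(n)
q1_hat = x + 0.4 * x
f0_hat = normal_pdf(q0_hat)       # density of Y | X, A=0 at q0_hat
f1_hat = normal_pdf(q1_hat - x)   # density of Y | X, A=1 at q1_hat

psi = cqte_pseudo_outcome(y, a, e_hat, q0_hat, q1_hat, f0_hat, f1_hat, tau)

coef_plugin = fit_linear(x, q1_hat - q0_hat)  # plug-in difference of quantiles
coef_pseudo = fit_linear(x, psi)              # pseudo-outcome regression

print("plug-in slope:       ", coef_plugin[1])
print("pseudo-outcome slope:", coef_pseudo[1])
```

In this toy setup the plug-in difference inherits the full bias of `q1_hat`, while the regression on `psi` should recover a slope much closer to the true value of 1, because the correction terms cancel the nuisance error to first order. With real data the nuisances would be estimated, typically with sample splitting, rather than written down as here.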