PUATE: Efficient Average Treatment Effect Estimation from Treated (Positive) and Unlabeled Units
This work addresses ATE estimation in weakly supervised settings; the contribution is incremental, as it builds on existing methods for missing data and PU learning.
The paper tackles the problem of estimating average treatment effects (ATEs) when only treated and unlabeled units are observed, a scenario related to positive and unlabeled (PU) learning. The authors derive semiparametric efficiency bounds and construct estimators that achieve these bounds, contributing to causal inference with missing data.
The estimation of average treatment effects (ATEs), defined as the difference in expected outcomes between treatment and control groups, is a central topic in causal inference. This study develops semiparametric efficient estimators for ATE in a setting where only a treatment group and an unlabeled group, consisting of units whose treatment status is unknown, are observed. This scenario constitutes a variant of learning from positive and unlabeled data (PU learning) and can be viewed as a special case of ATE estimation with missing data. For this setting, we derive the semiparametric efficiency bounds, which characterize the lowest achievable asymptotic variance for regular estimators. We then construct semiparametric efficient ATE estimators that attain these bounds. Our results contribute to the literature on causal inference with missing data and weakly supervised learning.
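To make the setting concrete, the toy simulation below shows why a treated group and an unlabeled group do not directly identify the ATE. This is only an illustrative sketch and is not the estimator proposed in the paper. It assumes (hypothetically) a constant treatment effect, treatment assigned independently of the outcomes, and a known fraction `prior` of treated units in the unlabeled group. The function names `simulate`, `naive_difference`, and `prior_corrected_ate` are invented for this example. Under these assumptions the unlabeled mean is a mixture of treated and control means, so the naive difference shrinks the ATE by a factor of `1 - prior`.

```python
import random
from statistics import fmean


def simulate(n_treated, n_unlabeled, prior, effect, seed=0):
    """Draw a treated sample and an unlabeled sample (toy model, assumed here).

    Y(0) ~ N(0, 1) and Y(1) = Y(0) + effect, so the true ATE equals `effect`.
    Each unlabeled unit is treated with probability `prior`, but its
    treatment status is not recorded.
    """
    rng = random.Random(seed)

    def outcome(is_treated):
        return rng.gauss(0.0, 1.0) + (effect if is_treated else 0.0)

    treated = [outcome(True) for _ in range(n_treated)]
    unlabeled = [outcome(rng.random() < prior) for _ in range(n_unlabeled)]
    return treated, unlabeled


def naive_difference(treated, unlabeled):
    """Treated mean minus unlabeled mean; biased toward zero for the ATE."""
    return fmean(treated) - fmean(unlabeled)


def prior_corrected_ate(treated, unlabeled, prior):
    """Rescale the naive difference by 1 / (1 - prior).

    Valid only under the toy assumptions above, since
    E[Y | unlabeled] = prior * E[Y(1)] + (1 - prior) * E[Y(0)].
    """
    if not 0.0 <= prior < 1.0:
        raise ValueError("prior must lie in [0, 1)")
    return naive_difference(treated, unlabeled) / (1.0 - prior)


if __name__ == "__main__":
    treated, unlabeled = simulate(20_000, 20_000, prior=0.4, effect=2.0)
    print("naive difference:", naive_difference(treated, unlabeled))
    print("prior-corrected: ", prior_corrected_ate(treated, unlabeled, 0.4))
```

With `prior=0.4` and a true effect of 2.0, the naive difference concentrates around 1.2 while the corrected value concentrates around 2.0. In realistic settings with covariates and an unknown mixing proportion, this simple rescaling is not enough, which is where efficiency bounds and purpose-built estimators become relevant.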