Efficient Adaptive Experimentation with Noncompliance
This work addresses a methodological problem in causal inference: how to run adaptive experiments when participants may not comply with their assignment. Its contribution is incremental, jointly optimizing the instrument-allocation rule and the estimator.
The paper tackles the problem of estimating average treatment effects in adaptive experiments with noncompliance, where treatment is encouraged via an instrumental variable rather than directly assigned. It introduces AMRIV, an adaptive, multiply-robust estimator paired with variance-optimal instrument assignment. In empirical studies, AMRIV is more efficient and more robust than existing baselines.
We study the problem of estimating the average treatment effect (ATE) in adaptive experiments where treatment can only be encouraged -- rather than directly assigned -- via a binary instrumental variable. Building on semiparametric efficiency theory, we derive the efficiency bound for ATE estimation under arbitrary, history-dependent instrument-assignment policies, and show it is minimized by a variance-aware allocation rule that balances outcome noise and compliance variability. Leveraging this insight, we introduce AMRIV -- an Adaptive, Multiply-Robust estimator for Instrumental-Variable settings with variance-optimal assignment. AMRIV pairs (i) an online policy that adaptively approximates the optimal allocation with (ii) a sequential, influence-function-based estimator that attains the semiparametric efficiency bound while retaining multiply-robust consistency. We establish asymptotic normality, explicit convergence rates, and anytime-valid asymptotic confidence sequences that enable sequential inference. Finally, we demonstrate the practical effectiveness of our approach through empirical studies, showing that adaptive instrument assignment, when combined with the AMRIV estimator, yields improved efficiency and robustness compared to existing baselines.
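To make the idea of a variance-aware allocation rule concrete, the sketch below works through a simplified, covariate-free analog. It is not the AMRIV policy or estimator, whose exact form is not given in this abstract. It assumes a plain Wald ratio estimator, which targets the ATE only under additional identifying assumptions. Under that assumption, the delta method gives an asymptotic variance proportional to s1^2/pi + s0^2/(1 - pi), where pi is the probability of assigning the encouragement (Z = 1) and s_z is the standard deviation of the residual Y - tau*D in arm z. That residual mixes outcome noise with compliance variability, and the variance is minimized by the Neyman-style choice pi = s1/(s1 + s0). All function names and the clipping parameter are hypothetical.

```python
from statistics import mean, pstdev


def wald_estimate(z, d, y):
    """Wald ratio: (E[Y|Z=1] - E[Y|Z=0]) / (E[D|Z=1] - E[D|Z=0])."""
    y1 = [yi for zi, yi in zip(z, y) if zi == 1]
    y0 = [yi for zi, yi in zip(z, y) if zi == 0]
    d1 = [di for zi, di in zip(z, d) if zi == 1]
    d0 = [di for zi, di in zip(z, d) if zi == 0]
    return (mean(y1) - mean(y0)) / (mean(d1) - mean(d0))


def residual_sds(z, d, y, tau):
    """Per-arm standard deviation of the residual Y - tau * D."""
    r1 = [yi - tau * di for zi, di, yi in zip(z, d, y) if zi == 1]
    r0 = [yi - tau * di for zi, di, yi in zip(z, d, y) if zi == 0]
    return pstdev(r1), pstdev(r0)


def neyman_style_allocation(s1, s0, clip=0.05):
    """Probability of Z=1 minimizing s1^2/pi + s0^2/(1-pi).

    The result is clipped to [clip, 1 - clip] (an assumption of this
    sketch) so that both arms keep receiving observations.
    """
    if s1 + s0 == 0:
        return 0.5  # no information about either arm: split evenly
    pi = s1 / (s1 + s0)
    return min(max(pi, clip), 1.0 - clip)


def variance_factor(pi, s1, s0):
    """Allocation-dependent part of the Wald estimator's variance."""
    return s1 ** 2 / pi + s0 ** 2 / (1.0 - pi)
```

For example, with residual standard deviations s1 = 3 and s0 = 1, the rule assigns the encouragement with probability 0.75, and the variance factor falls from 20 under a 50/50 split to 16. In an adaptive experiment, s1 and s0 would be re-estimated from the accumulated data (for instance with `residual_sds` and a running `wald_estimate`) before each new assignment.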