Should I Stop or Should I Go: Early Stopping with Heterogeneous Populations
This work addresses unintended harm in randomized experiments, offering researchers and practitioners a novel approach to improving safety in heterogeneous populations.
The paper tackles early stopping in randomized experiments when a treatment harms a minority group. It develops CLASH, a method that accounts for treatment effect heterogeneity, and demonstrates its effectiveness on simulated and real data for both clinical trials and A/B tests.
Randomized experiments often need to be stopped prematurely due to the treatment having an unintended harmful effect. Existing methods that determine when to stop an experiment early are typically applied to the data in aggregate and do not account for treatment effect heterogeneity. In this paper, we study the early stopping of experiments for harm on heterogeneous populations. We first establish that current methods often fail to stop experiments when the treatment harms a minority group of participants. We then use causal machine learning to develop CLASH, the first broadly-applicable method for heterogeneous early stopping. We demonstrate CLASH's performance on simulated and real data and show that it yields effective early stopping for both clinical trials and A/B tests.
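The failure mode described above, where an aggregate stopping test misses harm concentrated in a minority group, can be illustrated with a small back-of-the-envelope calculation. The sketch below is not CLASH and is not taken from the paper; it only compares the expected z-statistic of a single one-sided test on the pooled data with the same test restricted to the harmed subgroup. All numbers (sample size, minority share, effect size, unit noise variance) and the helper `expected_z` are hypothetical, and the threshold ignores the corrections a real analysis would need for repeated interim looks and multiple subgroups.

```python
from math import sqrt
from statistics import NormalDist


def expected_z(effect, var_treated, var_control, n_per_arm):
    """Expected two-sample z-statistic for a difference in means."""
    standard_error = sqrt((var_treated + var_control) / n_per_arm)
    return effect / standard_error


# Hypothetical setup: every number here is an illustrative assumption.
n_per_arm = 200          # participants per arm at the interim look
minority_share = 0.10    # fraction of participants in the harmed group
harm_in_minority = 1.0   # treatment effect in the minority (higher = more harm)
noise_var = 1.0          # outcome noise variance; majority effect is zero

# Aggregate analysis: the harm is diluted by the unaffected majority.
aggregate_effect = minority_share * harm_in_minority
# The treated arm mixes two effect levels, which adds variance.
aggregate_var_treated = (
    noise_var + minority_share * (1 - minority_share) * harm_in_minority ** 2
)
z_aggregate = expected_z(
    aggregate_effect, aggregate_var_treated, noise_var, n_per_arm
)

# Subgroup analysis: fewer participants, but an undiluted effect.
n_minority = round(n_per_arm * minority_share)
z_minority = expected_z(harm_in_minority, noise_var, noise_var, n_minority)

# One-sided 5% threshold for a single look (no multiplicity correction).
threshold = NormalDist().inv_cdf(0.95)

print(f"aggregate z = {z_aggregate:.2f}")
print(f"minority  z = {z_minority:.2f}")
print(f"threshold   = {threshold:.2f}")
```

Under these assumed numbers, the aggregate statistic is roughly 0.98 and stays below the threshold of about 1.64, while the minority-only statistic is roughly 3.16 and exceeds it. In practice the harmed group is not known in advance, which is the gap a heterogeneity-aware method such as CLASH is meant to fill.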