LG ME

Jul 27, 2023

Rapid and Scalable Bayesian AB Testing

Srivas Chennu, Andrew Maher, Christian Pangerl, Subash Prabanantham, Jae Hyeon Bae, Jamie Martin, Bud Goswami

arXiv:2307.14628v12.01 citations; h-index: 27

Originality: Incremental advance

AI Analysis

This work addresses practical challenges in AB testing for business operators in the technology industry, offering an incremental improvement over existing sequential methods.

The paper targets two limitations of traditional AB testing: low statistical power in multivariate designs and the inability to pool knowledge from past tests. It proposes a hierarchical Bayesian estimation approach that increases statistical power, enables sequential testing with early stopping, and accelerates future tests through composite global learnings.

AB testing aids business operators with their decision-making, and is considered the gold standard method for learning from data to improve digital user experiences. However, there is usually a gap between the requirements of practitioners and the constraints imposed by the statistical hypothesis testing methodologies commonly used for analysis of AB tests. These constraints include the lack of statistical power in multivariate designs with many factors, correlations between these factors, the need for sequential testing for early stopping, and the inability to pool knowledge from past tests. Here, we propose a solution that applies hierarchical Bayesian estimation to address the above limitations. In comparison to current sequential AB testing methodology, we increase statistical power by exploiting correlations between factors, enabling sequential testing and progressive early stopping, without incurring excessive false positive risk. We also demonstrate how this methodology can be extended to enable the extraction of composite global learnings from past AB tests, to accelerate future tests. We underpin our work with a solid theoretical framework that articulates the value of hierarchical estimation. We demonstrate its utility using both numerical simulations and a large set of real-world AB tests. Together, these results highlight the practical value of our approach for statistical inference in the technology industry.
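To make the idea of hierarchical estimation concrete, the sketch below shows partial pooling in a simple normal-normal model: each variant's observed effect is shrunk toward a shared mean, which narrows its posterior spread. This is only an illustration under stated assumptions, not the method of the paper. The function name `partial_pool`, the plug-in (empirical Bayes) estimates, the method-of-moments variance estimate, and the example numbers are all hypothetical choices made here.

```python
from statistics import mean, variance


def partial_pool(effects, std_errors):
    """Plug-in (empirical Bayes) shrinkage for a normal-normal hierarchy.

    Assumed model (illustrative, not taken from the paper):
        effect_i ~ Normal(theta_i, std_error_i**2)
        theta_i  ~ Normal(mu, tau2)
    Standard errors are assumed known and strictly positive.
    """
    s2 = [s * s for s in std_errors]
    # Method-of-moments estimate of the between-variant variance, floored at 0.
    tau2 = max(0.0, variance(effects) - mean(s2))
    # Precision-weighted estimate of the shared mean.
    weights = [1.0 / (v + tau2) for v in s2]
    mu = sum(w * y for w, y in zip(weights, effects)) / sum(weights)

    pooled = []
    for y, v in zip(effects, s2):
        shrink = v / (v + tau2)  # 1 means fully pooled, 0 means no pooling
        post_mean = shrink * mu + (1.0 - shrink) * y
        # Conditional on the plug-in mu and tau2; ignores their uncertainty.
        post_sd = ((1.0 - shrink) * v) ** 0.5
        pooled.append((post_mean, post_sd))
    return mu, tau2, pooled


# Hypothetical observed lifts for four variants, each with standard error 0.02.
mu, tau2, pooled = partial_pool([0.02, 0.05, -0.01, 0.03], [0.02] * 4)
for post_mean, post_sd in pooled:
    print(f"posterior mean={post_mean:.4f}, posterior sd={post_sd:.4f}")
```

Because the posterior standard deviation of each variant is never larger than its raw standard error, the pooled estimates are more precise than the unpooled ones. A full Bayesian treatment would also propagate uncertainty in the shared mean and variance, which this plug-in sketch omits.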
