ML, Jan 30, 2015

Confidence intervals for AB-test

arXiv:1501.07768v1, 1 citation

Originality: Synthesis-oriented

AI Analysis

This work addresses the need for accurate confidence interval estimation in A/B testing at web companies. It appears incremental, since it builds on established statistical tools such as bootstrapping and the central limit theorem.

The authors address the problem of deciding when to stop an A/B test. They develop a mathematical framework and three algorithms for computing reliable confidence intervals. The algorithms apply to metrics beyond simple success probabilities, covering both absolute and relative increments of quantities such as event counts and click-through rates.

AB-testing is a very popular technique in web companies since it makes it possible to accurately predict the impact of a modification with the simplicity of a random split across users. One of the critical aspects of an AB-test is its duration, and it is important to reliably compute confidence intervals associated with the metric of interest to know when to stop the test. In this paper, we define a clean mathematical framework to model the AB-test process. We then propose three algorithms based on bootstrapping and on the central limit theorem to compute reliable confidence intervals, which extend to metrics other than the common probabilities of success. They apply to both absolute and relative increments of the most used comparison metrics, including the number of occurrences of a particular event and a click-through rate, which involves a ratio.
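To make the two ingredients named in the abstract concrete, here is a minimal Python sketch of a confidence interval for the relative increment mean(B)/mean(A) - 1, computed once with a percentile bootstrap and once with a central-limit-theorem (delta-method) approximation. It is an illustration under assumptions, not a reproduction of the paper's three algorithms: the function names `bootstrap_relative_ci` and `clt_relative_ci`, the per-user 0/1 click data, the 10% and 12% click rates, and the resample count are all hypothetical choices made for this example.

```python
import random
import statistics


def relative_increment(a, b):
    """Point estimate of mean(b) / mean(a) - 1."""
    return statistics.fmean(b) / statistics.fmean(a) - 1.0


def bootstrap_relative_ci(a, b, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI: resample each group with replacement."""
    rng = random.Random(seed)
    stats = []
    for _ in range(n_boot):
        mean_a = statistics.fmean(rng.choices(a, k=len(a)))
        mean_b = statistics.fmean(rng.choices(b, k=len(b)))
        if mean_a == 0:  # ratio undefined for this resample, skip it
            continue
        stats.append(mean_b / mean_a - 1.0)
    stats.sort()
    m = len(stats)
    lo = stats[int(alpha / 2 * m)]
    hi = stats[min(m - 1, int((1 - alpha / 2) * m))]
    return lo, hi


def clt_relative_ci(a, b, alpha=0.05):
    """Normal-approximation CI for the ratio of means (delta method)."""
    mean_a, mean_b = statistics.fmean(a), statistics.fmean(b)
    if mean_a == 0 or mean_b == 0:
        raise ValueError("delta method needs non-zero group means")
    ratio = mean_b / mean_a
    # First-order variance of the ratio of two independent sample means.
    rel_var = (statistics.variance(b) / (len(b) * mean_b ** 2)
               + statistics.variance(a) / (len(a) * mean_a ** 2))
    z = statistics.NormalDist().inv_cdf(1 - alpha / 2)
    half_width = z * ratio * rel_var ** 0.5
    return ratio - 1.0 - half_width, ratio - 1.0 + half_width


# Hypothetical data: one 0/1 click indicator per user in each group.
rng = random.Random(42)
group_a = [1 if rng.random() < 0.10 else 0 for _ in range(1000)]
group_b = [1 if rng.random() < 0.12 else 0 for _ in range(1000)]

print("relative increment:", relative_increment(group_a, group_b))
print("bootstrap CI:", bootstrap_relative_ci(group_a, group_b))
print("CLT CI:", clt_relative_ci(group_a, group_b))
```

The bootstrap makes no distributional assumption but costs many resamples, while the delta-method interval is a closed form that relies on both sample means being approximately normal. In this sketch a test would be considered conclusive once the chosen interval excludes zero, which is one simple reading of the stopping question raised in the abstract.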
