ML, Jan 30, 2015

Confidence intervals for AB-test

arXiv:1501.07768v1 (1 citation)
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for accurate confidence interval estimation in A/B testing at web companies. It appears incremental, since it builds on existing statistical methods such as bootstrapping and the central limit theorem.

The authors tackle the problem of deciding when to stop an A/B test. They develop a mathematical framework and three algorithms for computing reliable confidence intervals. The algorithms apply to metrics beyond simple success probabilities, covering both absolute and relative increments of quantities such as event counts and click-through rates.

AB-testing is a very popular technique in web companies since it makes it possible to accurately predict the impact of a modification with the simplicity of a random split across users. One of the critical aspects of an AB-test is its duration, and it is important to reliably compute confidence intervals associated with the metric of interest to know when to stop the test. In this paper, we define a clean mathematical framework to model the AB-test process. We then propose three algorithms based on bootstrapping and on the central limit theorem to compute reliable confidence intervals which extend to metrics other than the common probabilities of success. They apply to both absolute and relative increments of the most used comparison metrics, including the number of occurrences of a particular event and a click-through rate, which involves a ratio.
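
To make the bootstrapping idea concrete, here is a minimal Python sketch of a percentile-bootstrap confidence interval for the relative increment of a click-through rate. It is not one of the paper's three algorithms, which are not reproduced on this page. It is a generic textbook bootstrap under stated assumptions: each user is a (clicks, views) pair, users are resampled with replacement within each population, and the click-through rate is total clicks divided by total views. The names `simulate_users` and `bootstrap_relative_lift_ci`, and all parameter values, are hypothetical and chosen only for illustration.

```python
import random


def simulate_users(n_users, ctr, seed):
    """Hypothetical synthetic data: one (clicks, views) pair per user."""
    rng = random.Random(seed)
    users = []
    for _ in range(n_users):
        views = rng.randint(1, 5)  # every user has at least one view
        clicks = sum(1 for _ in range(views) if rng.random() < ctr)
        users.append((clicks, views))
    return users


def click_through_rate(users):
    """Ratio metric: total clicks over total views."""
    total_views = sum(views for _, views in users)
    total_clicks = sum(clicks for clicks, _ in users)
    return total_clicks / total_views


def bootstrap_relative_lift_ci(users_a, users_b, n_resamples=2000,
                               alpha=0.05, seed=0):
    """Percentile-bootstrap interval for CTR(B) / CTR(A) - 1.

    Assumes users are independent, so whole users (not single views)
    are resampled with replacement within each population.
    """
    rng = random.Random(seed)
    lifts = []
    for _ in range(n_resamples):
        sample_a = rng.choices(users_a, k=len(users_a))
        sample_b = rng.choices(users_b, k=len(users_b))
        ctr_a = click_through_rate(sample_a)
        if ctr_a == 0:
            continue  # relative increment is undefined for this resample
        lifts.append(click_through_rate(sample_b) / ctr_a - 1.0)
    lifts.sort()
    lower = lifts[int((alpha / 2) * len(lifts))]
    upper = lifts[min(len(lifts) - 1, int((1 - alpha / 2) * len(lifts)))]
    return lower, upper


if __name__ == "__main__":
    population_a = simulate_users(1000, ctr=0.10, seed=1)
    population_b = simulate_users(1000, ctr=0.20, seed=2)
    lower, upper = bootstrap_relative_lift_ci(population_a, population_b)
    print(f"95% interval for the relative CTR increment: [{lower:.3f}, {upper:.3f}]")
```

Resampling whole users rather than individual views keeps each user's views together, which matters because views from the same user are generally correlated. A stopping rule could then check, for example, whether the interval excludes zero, although the paper's own criterion may differ.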
