ML · LG · Jun 12, 2025

Practical Improvements of A/B Testing with Off-Policy Estimation

Otmane Sakhi, Alexandre Gilotte, David Rohde

arXiv:2506.10677v2 · 7.81 citations · h-index: 6

Originality: Incremental advance

AI Analysis

The work offers a practical improvement for practitioners in online experimentation and decision-making, though the advance is incremental because it builds on existing A/B testing methods.

The paper aims to reduce the variance of effect estimates in A/B testing. It introduces a family of unbiased off-policy estimators with lower variance than the standard difference-in-means estimator; the lowest-variance member of the family offers substantial variance reduction when the two tested systems are similar.

We address the problem of A/B testing, a widely used protocol for evaluating the potential improvement achieved by a new decision system compared to a baseline. This protocol segments the population into two subgroups, each exposed to a version of the system and estimates the improvement as the difference between the measured effects. In this work, we demonstrate that the commonly used difference-in-means estimator, while unbiased, can be improved. We introduce a family of unbiased off-policy estimators that achieves lower variance than the standard approach. Among this family, we identify the estimator with the lowest variance. The resulting estimator is simple, and offers substantial variance reduction when the two tested systems exhibit similarities. Our theoretical analysis and experimental results validate the effectiveness and practicality of the proposed method.
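To make the contrast in the abstract concrete, here is a minimal simulation sketch. It is not the paper's estimator family: it assumes a toy setting in which each system is a known stochastic policy over five discrete actions with Bernoulli rewards, and it compares difference-in-means with one standard off-policy construction, importance weighting against the pooled mixture of both logging policies. The function names, policies, reward means, and sample sizes are all hypothetical choices made for illustration.

```python
import numpy as np

def simulate(rng, pi, mu, n):
    """Draw n actions from policy pi and Bernoulli rewards with means mu[action]."""
    actions = rng.choice(len(pi), size=n, p=pi)
    rewards = rng.binomial(1, mu[actions]).astype(float)
    return actions, rewards

def difference_in_means(r_a, r_b):
    """Standard A/B estimate: mean reward in group B minus mean reward in group A."""
    return r_b.mean() - r_a.mean()

def mixture_off_policy(a_a, r_a, a_b, r_b, pi_a, pi_b):
    """Pool both groups and reweight each reward by
    (pi_b - pi_a)(action) / (n_a * pi_a(action) + n_b * pi_b(action)).
    Assumption: both policies' action probabilities are known exactly.
    The weight is zero wherever the two policies agree."""
    n_a, n_b = len(r_a), len(r_b)
    actions = np.concatenate([a_a, a_b])
    rewards = np.concatenate([r_a, r_b])
    denom = n_a * pi_a[actions] + n_b * pi_b[actions]
    weights = (pi_b[actions] - pi_a[actions]) / denom
    return float(np.sum(weights * rewards))

# Hypothetical setup: two similar policies that differ only on actions 0 and 4.
mu = np.array([0.1, 0.2, 0.3, 0.4, 0.5])          # mean reward per action
pi_a = np.array([0.20, 0.20, 0.20, 0.20, 0.20])   # baseline system
pi_b = np.array([0.15, 0.20, 0.20, 0.20, 0.25])   # candidate system
true_delta = float(mu @ (pi_b - pi_a))

rng = np.random.default_rng(0)
dim, ope = [], []
for _ in range(500):                               # repeated simulated A/B tests
    a_a, r_a = simulate(rng, pi_a, mu, 1000)
    a_b, r_b = simulate(rng, pi_b, mu, 1000)
    dim.append(difference_in_means(r_a, r_b))
    ope.append(mixture_off_policy(a_a, r_a, a_b, r_b, pi_a, pi_b))
dim, ope = np.array(dim), np.array(ope)

print("true effect           :", true_delta)
print("difference-in-means   : mean", dim.mean(), "variance", dim.var())
print("mixture off-policy    : mean", ope.mean(), "variance", ope.var())
```

In this toy setup both estimators are unbiased for the true effect, but the off-policy estimate only accumulates noise from actions on which the two policies disagree, which is the intuition behind the abstract's remark that the gain is largest when the tested systems are similar.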
