ML · LG · Jun 12, 2025

Practical Improvements of A/B Testing with Off-Policy Estimation

arXiv:2506.10677v2 · 1 citation · h-index: 6
Originality: Incremental advance
AI Analysis

The paper provides a practical improvement for practitioners in online experimentation and decision-making, though it is an incremental advance that builds on existing A/B testing methods.

The paper aims to improve A/B testing by reducing the variance of effect estimates. It introduces a family of unbiased off-policy estimators that achieve lower variance than the standard difference-in-means approach; the best estimator in the family offers substantial variance reduction when the two tested systems are similar.

We address the problem of A/B testing, a widely used protocol for evaluating the potential improvement achieved by a new decision system compared to a baseline. This protocol segments the population into two subgroups, each exposed to one version of the system, and estimates the improvement as the difference between the measured effects. In this work, we demonstrate that the commonly used difference-in-means estimator, while unbiased, can be improved. We introduce a family of unbiased off-policy estimators that achieve lower variance than the standard approach. Among this family, we identify the estimator with the lowest variance. The resulting estimator is simple and offers substantial variance reduction when the two tested systems exhibit similarities. Our theoretical analysis and experimental results validate the effectiveness and practicality of the proposed method.
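To make the contrast in the abstract concrete, here is a minimal simulation sketch comparing difference-in-means with one simple unbiased off-policy estimator. It is an illustration under stated assumptions, not necessarily the estimator derived in the paper: both systems are modelled as known, context-free stochastic policies `pi_a` and `pi_b` over a small action set, rewards are Bernoulli, the two arms have equal size, and the off-policy estimator pools both arms and reweights each reward by `(pi_b - pi_a) / pi_mix`, where `pi_mix` is the average of the two policies. All function names and numbers are hypothetical.

```python
import numpy as np

def simulate_ab(pi_a, pi_b, mean_reward, n_per_arm, rng):
    """Simulate one A/B test with equal-sized arms and Bernoulli rewards."""
    actions_a = rng.choice(len(pi_a), size=n_per_arm, p=pi_a)
    actions_b = rng.choice(len(pi_b), size=n_per_arm, p=pi_b)
    rewards_a = rng.binomial(1, mean_reward[actions_a])
    rewards_b = rng.binomial(1, mean_reward[actions_b])
    return actions_a, rewards_a, actions_b, rewards_b

def difference_in_means(rewards_a, rewards_b):
    """Standard A/B estimate: mean reward in arm B minus mean reward in arm A."""
    return rewards_b.mean() - rewards_a.mean()

def pooled_off_policy(actions_a, rewards_a, actions_b, rewards_b, pi_a, pi_b):
    """Illustrative off-policy estimate of V(pi_b) - V(pi_a).

    Assumes equal arm sizes, so the pooled data is logged under the
    mixture pi_mix = (pi_a + pi_b) / 2. Each reward is reweighted by
    (pi_b - pi_a) / pi_mix, which keeps the estimate unbiased.
    """
    actions = np.concatenate([actions_a, actions_b])
    rewards = np.concatenate([rewards_a, rewards_b])
    pi_mix = 0.5 * (pi_a + pi_b)
    weights = (pi_b[actions] - pi_a[actions]) / pi_mix[actions]
    return np.mean(weights * rewards)

def compare_variances(pi_a, pi_b, mean_reward, n_per_arm=1000, n_runs=500, seed=0):
    """Repeat the simulated A/B test and compare both estimators empirically."""
    rng = np.random.default_rng(seed)
    dim, ope = [], []
    for _ in range(n_runs):
        data = simulate_ab(pi_a, pi_b, mean_reward, n_per_arm, rng)
        dim.append(difference_in_means(data[1], data[3]))
        ope.append(pooled_off_policy(*data, pi_a, pi_b))
    return np.var(dim), np.var(ope), np.mean(ope)

if __name__ == "__main__":
    # Hypothetical, deliberately similar policies over three actions.
    pi_a = np.array([0.50, 0.30, 0.20])
    pi_b = np.array([0.45, 0.35, 0.20])
    mean_reward = np.array([0.1, 0.2, 0.3])
    var_dim, var_ope, mean_ope = compare_variances(pi_a, pi_b, mean_reward)
    print("variance, difference-in-means:", var_dim)
    print("variance, pooled off-policy:  ", var_ope)
    print("mean pooled off-policy estimate:", mean_ope)
```

In this toy setting the weights shrink toward zero as the two policies become more alike, which is one way to see why similarity between the tested systems can translate into lower variance. When the policies are identical the pooled estimate is exactly zero, whereas difference-in-means still fluctuates with sampling noise.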
