Theoretical Analysis on the Efficiency of Interleaved Comparisons
This work explains, for researchers and practitioners in information retrieval, why interleaving is efficient, a question the literature had left open. The contribution is incremental in that it builds on existing interleaving methods.
This study provides a theoretical analysis of why interleaving, an online evaluation method for rankings, is efficient. It finds that interleaving is more efficient than A/B testing when users leave the ranking depending on item relevance, and the experimental results are consistent with the theory.
This study presents a theoretical analysis of the efficiency of interleaving, an online evaluation method for rankings. Although interleaving has already been applied to production systems, the source of its high efficiency has not been clarified in the literature. We begin by designing a simple interleaving method similar to ordinary interleaving methods. Then, we explore a condition under which the interleaving method is more efficient than A/B testing and find that this is the case when users leave the ranking depending on the item's relevance, a typical assumption made in click models. Finally, we perform experiments based on numerical analysis and user simulation, demonstrating that the theoretical results are consistent with the empirical results.
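To make the setting concrete, the following is a minimal sketch of an interleaved comparison with a simulated user. It is not the simple interleaving method designed in this study: it uses team-draft interleaving, a commonly used ordinary interleaving method, as a stand-in. The function names, the deterministic click on relevant items, and the leave probabilities are all hypothetical choices made for illustration. The only property taken from the abstract is that the probability of leaving the ranking depends on the relevance of the item just examined.

```python
import random


def team_draft_interleave(ranking_a, ranking_b, rng):
    """Team-draft interleaving (stand-in method, not the paper's own design)."""
    sources = {"A": ranking_a, "B": ranking_b}
    picks = {"A": 0, "B": 0}
    team_of = {}  # item -> team that contributed it
    interleaved = []
    total = len(set(ranking_a) | set(ranking_b))
    while len(interleaved) < total:
        # The team with fewer picks goes next; ties are broken by a coin flip.
        if picks["A"] == picks["B"]:
            team = rng.choice(["A", "B"])
        else:
            team = min(picks, key=picks.get)
        remaining = [d for d in sources[team] if d not in team_of]
        if not remaining:
            # This team has nothing left to contribute, so the other one picks.
            team = "B" if team == "A" else "A"
            remaining = [d for d in sources[team] if d not in team_of]
        item = remaining[0]
        team_of[item] = team
        interleaved.append(item)
        picks[team] += 1
    return interleaved, team_of


def simulate_clicks(ranking, relevance, leave_prob, rng):
    """Hypothetical user: scans top-down, clicks every relevant item seen,
    and leaves with a probability that depends on the item's relevance."""
    clicked = []
    for item in ranking:
        rel = relevance[item]
        if rel:
            clicked.append(item)
        if rng.random() < leave_prob[rel]:
            break
    return clicked


def compare(ranking_a, ranking_b, relevance, leave_prob, sessions, seed=0):
    """Count, over simulated sessions, which ranking's team got more clicks."""
    rng = random.Random(seed)
    wins = {"A": 0, "B": 0, "tie": 0}
    for _ in range(sessions):
        interleaved, team_of = team_draft_interleave(ranking_a, ranking_b, rng)
        clicks = simulate_clicks(interleaved, relevance, leave_prob, rng)
        a = sum(team_of[d] == "A" for d in clicks)
        b = len(clicks) - a
        wins["A" if a > b else "B" if b > a else "tie"] += 1
    return wins


if __name__ == "__main__":
    ranking_a = ["d1", "d2", "d3", "d4"]   # places the relevant item first
    ranking_b = ["d4", "d3", "d2", "d1"]   # places the relevant item last
    relevance = {"d1": 1, "d2": 0, "d3": 0, "d4": 0}
    leave_prob = {1: 0.7, 0: 0.1}          # assumed values, for illustration
    print(compare(ranking_a, ranking_b, relevance, leave_prob, sessions=200))
```

The sketch only tallies per-session preferences. Comparing efficiency against A/B testing, as the analysis does, would additionally require simulating each ranking on separate user groups and measuring how many sessions each approach needs to identify the better ranking.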