IR | Jul 15, 2019

To Model or to Intervene: A Comparison of Counterfactual and Online Learning to Rank from User Interactions

arXiv:1907.06412v1 | 82 citations
Originality: Incremental advance
AI Analysis

This work helps practitioners in information retrieval choose between model-based and intervention-based methods for unbiased ranking, based on bias and noise conditions.

The study compared counterfactual and online learning-to-rank methods for handling bias and noise in user interactions, finding that counterfactual methods perform best with low bias or noise, while online methods are more robust in biased or noisy settings but require control over rankings.

Learning to Rank (LTR) from user interactions is challenging as user feedback often contains high levels of bias and noise. At the moment, two methodologies for dealing with bias prevail in the field of LTR: counterfactual methods that learn from historical data and model user behavior to deal with biases; and online methods that perform interventions to deal with bias but use no explicit user models. For practitioners the decision between either methodology is very important because of its direct impact on end users. Nevertheless, there has never been a direct comparison between these two approaches to unbiased LTR. In this study we provide the first benchmarking of both counterfactual and online LTR methods under different experimental conditions. Our results show that the choice between the methodologies is consequential and depends on the presence of selection bias, and the degree of position bias and interaction noise. In settings with little bias or noise counterfactual methods can obtain the highest ranking performance; however, in other circumstances their optimization can be detrimental to the user experience. Conversely, online methods are very robust to bias and noise but require control over the displayed rankings. Our findings confirm and contradict existing expectations on the impact of model-based and intervention-based methods in LTR, and allow practitioners to make an informed decision between the two methodologies.

Code Implementations: 2 repos