IR · May 25, 2020

Cascade Model-based Propensity Estimation for Counterfactual Learning to Rank

Ali Vardasbi, Maarten de Rijke, Ilya Markov

arXiv:2005.11938v1 · 44 citations

Originality: Incremental advance

AI Analysis

The paper addresses a specific bottleneck in unbiased learning to rank for search systems, but the advance is incremental: it adapts existing methods to a known user behavior model.

The paper tackles inaccurate propensity estimation in counterfactual learning to rank when user clicks follow a cascade model. It proposes CM-IPS, which keeps ranking performance close to full-information performance in cascade scenarios.

Unbiased counterfactual learning to rank (CLTR) requires click propensities to compensate, via inverse propensity scoring (IPS), for the difference between user clicks and the true relevance of search results. Current propensity estimation methods assume that user click behavior follows the position-based model (PBM) and estimate click propensities based on this assumption. However, in reality, user clicks often follow the cascade model (CM), where users scan search results from top to bottom and where each next click depends on the previous one. In this cascade scenario, PBM-based estimates of propensities are not accurate, which, in turn, hurts CLTR performance. In this paper, we propose a propensity estimation method for the cascade scenario, called CM-IPS. We show that CM-IPS keeps CLTR performance close to the full-information performance when user clicks follow the CM, while PBM-based CLTR shows a significant gap to the full-information performance. The opposite is true if user clicks follow the PBM instead of the CM. Finally, we suggest a way to select between CM- and PBM-based propensity estimation methods based on historical user clicks.
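
To make the contrast between the two click models concrete, the following is a minimal Python sketch, not the paper's CM-IPS estimator. It assumes the classic cascade model in which the user stops scanning after the first click, and it treats each document's relevance as its click probability once examined. All function names are hypothetical and chosen for illustration only.

```python
import random


def simulate_pbm_clicks(relevance, exam_probs, rng):
    """Position-based model: a click at rank k needs the rank to be examined
    (probability depends only on k) and the document to be attractive."""
    return [
        int(rng.random() < exam_probs[k] and rng.random() < relevance[k])
        for k in range(len(relevance))
    ]


def simulate_cascade_clicks(relevance, rng):
    """Classic cascade model (assumed here): scan top to bottom,
    click the first attractive document, then stop."""
    clicks = [0] * len(relevance)
    for k, rel in enumerate(relevance):
        if rng.random() < rel:
            clicks[k] = 1
            break
    return clicks


def cascade_examination_probs(relevance):
    """Under the assumed cascade model, rank k is examined only if no
    document above it was clicked, so examination depends on the documents
    shown above and not on the rank alone."""
    probs = []
    exam = 1.0
    for rel in relevance:
        probs.append(exam)
        exam *= 1.0 - rel
    return probs


def ips_relevance_estimates(click_logs, propensities):
    """Inverse propensity scoring: per-rank click rate divided by the
    propensity of that rank being examined."""
    n = len(click_logs)
    return [
        sum(log[k] for log in click_logs) / (n * propensities[k])
        for k in range(len(propensities))
    ]
```

In this sketch, a fixed per-rank propensity vector is the right correction for clicks drawn from `simulate_pbm_clicks`, whereas for `simulate_cascade_clicks` the examination probability at a rank changes with the relevance of the documents above it. That dependence illustrates why PBM-based propensities can be inaccurate in the cascade scenario.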
