IR LG · May 18, 2020

Non-Clicks Mean Irrelevant? Propensity Ratio Scoring As a Correction

Nan Wang, Zhen Qin, Xuanhui Wang, Hongning Wang

arXiv:2005.08480v211.734 citations

Originality: Incremental advance

AI Analysis

This work addresses click bias in learning to rank for search systems and offers a more effective way to use click data. The advance is incremental because it builds on existing inverse propensity scoring approaches.

The paper tackles a bias that remains in unbiased learning to rank: non-clicked documents are implicitly treated as irrelevant, which produces unnecessary comparisons between relevant documents and hinders optimization. It introduces Propensity Ratio Scoring (PRS), a weighting scheme that corrects bias in both clicks and non-clicks, and reports improved performance on synthetic benchmarks and real-world GMail search data.

Recent advances in unbiased learning to rank (LTR) count on Inverse Propensity Scoring (IPS) to eliminate bias in implicit feedback. Though theoretically sound in correcting the bias introduced by treating clicked documents as relevant, IPS ignores the bias caused by (implicitly) treating non-clicked ones as irrelevant. In this work, we first rigorously prove that such use of click data leads to unnecessary pairwise comparisons between relevant documents, which prevent unbiased ranker optimization. Based on the proof, we derive a simple yet well justified new weighting scheme, called Propensity Ratio Scoring (PRS), which provides treatments on both clicks and non-clicks. Besides correcting the bias in clicks, PRS avoids relevant-relevant document comparisons in LTR training and enjoys a lower variability. Our extensive empirical evaluations confirm that PRS ensures a more effective use of click data and improved performance in both synthetic data from a set of LTR benchmarks, as well as in the real-world large-scale data from GMail search.
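
To make the contrast between the two weighting ideas concrete, here is a minimal sketch for a single query. The abstract does not state the PRS formula, so the ratio form used here (examination propensity of the non-clicked document divided by that of the clicked document) is an assumption inferred from the method's name, not the authors' definition. The function name `pair_weights` and the propensity values are hypothetical.

```python
def pair_weights(clicks, propensities):
    """Weights for (clicked i, non-clicked j) pairs in one ranked list.

    clicks[k] is 1 if the document at position k was clicked, else 0.
    propensities[k] is the assumed probability that position k was examined.
    Returns {(i, j): (ips_weight, ratio_weight)}.
    """
    weights = {}
    for i, clicked_i in enumerate(clicks):
        if not clicked_i:
            continue
        for j, clicked_j in enumerate(clicks):
            if clicked_j:
                continue
            # IPS-style pairwise weight: corrects only for the clicked document.
            ips = 1.0 / propensities[i]
            # Assumed ratio-style weight: also discounts non-clicked documents
            # that were unlikely to have been examined at all.
            ratio = propensities[j] / propensities[i]
            weights[(i, j)] = (ips, ratio)
    return weights


# Hypothetical example: one click at position 1, propensity decaying with rank.
clicks = [0, 1, 0, 0]
propensities = [1.0, 0.5, 0.25, 0.1]
weights = pair_weights(clicks, propensities)
for pair, (ips, ratio) in sorted(weights.items()):
    print(pair, ips, ratio)
```

Under these assumptions, the IPS-style weight is the same for every pair involving the clicked document, while the ratio-style weight shrinks for non-clicked documents at rarely examined positions, where a missing click says little about relevance.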
