Offline Evaluation of Ranking Policies with Click Models
This work addresses a practical problem for web systems such as recommender systems and search engines: evaluating ranking policies offline more efficiently. The contribution is incremental in that it builds on existing click model frameworks.
The paper tackles offline evaluation of ranking policies. It proposes estimators that use click models to estimate the expected number of clicks from logged data, proves that they are more efficient than estimators that ignore the click model's structure when the click model holds, and shows that they consistently outperform prior estimators in experiments on a real-world dataset.
Many web systems rank and present a list of items to users, from recommender systems to search and advertising. An important problem in practice is to evaluate new ranking policies offline and optimize them before they are deployed. We address this problem by proposing evaluation algorithms for estimating the expected number of clicks on ranked lists from historical logged data. The existing algorithms are not guaranteed to be statistically efficient in our problem because the number of recommended lists can grow exponentially with their length. To overcome this challenge, we use models of user interaction with the list of items, the so-called click models, to construct estimators that learn statistically efficiently. We analyze our estimators and prove that they are more efficient than the estimators that do not use the structure of the click model, under the assumption that the click model holds. We evaluate our estimators in a series of experiments on a real-world dataset and show that they consistently outperform prior estimators.
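To illustrate why click-model structure can help, here is a minimal Python sketch. It is not the paper's implementation. It contrasts a list-level importance-sampling estimator with one that assumes a click depends only on the item and its position, which is just one possible click-model assumption. The toy items, the uniform logging policy, the click rule, and all function names are hypothetical and chosen only for illustration.

```python
from itertools import permutations

# Hypothetical toy setup: 3 items, ranked lists of length 2.
ITEMS = (0, 1, 2)
LIST_LEN = 2
ALL_LISTS = list(permutations(ITEMS, LIST_LEN))  # 6 possible lists

# Assumed logging policy: uniform over all possible lists.
LIST_PROB = 1.0 / len(ALL_LISTS)   # probability of logging any given full list
ITEM_POS_PROB = 1.0 / len(ITEMS)   # probability of a given item at a given position


def simulated_clicks(ranked):
    """Toy click behavior (an assumption): only item 0 is ever clicked."""
    return [1 if item == 0 else 0 for item in ranked]


def list_ips(logs, target):
    """List-level estimator: a record counts only if the whole list matches."""
    total = 0.0
    for ranked, clicks in logs:
        if ranked == target:
            total += sum(clicks) / LIST_PROB
    return total / len(logs)


def item_position_ips(logs, target):
    """Item-position estimator: a click at position k counts whenever the
    logged item at k equals the target list's item at k."""
    total = 0.0
    for ranked, clicks in logs:
        for k, (item, click) in enumerate(zip(ranked, clicks)):
            if item == target[k]:
                total += click / ITEM_POS_PROB
    return total / len(logs)


# Log every list exactly once, which mirrors the uniform logging policy.
logs = [(ranked, simulated_clicks(ranked)) for ranked in ALL_LISTS]
target = (0, 1)  # the list shown by the new policy being evaluated

list_value = list_ips(logs, target)
item_value = item_position_ips(logs, target)

# How many logged records contribute to each estimate.
list_matches = sum(1 for ranked, _ in logs if ranked == target)
item_matches = sum(
    1 for ranked, _ in logs
    if any(item == target[k] for k, item in enumerate(ranked))
)

print(list_value, item_value, list_matches, item_matches)
```

On this toy log, both estimates agree with the true value of one expected click for the target list, up to floating-point rounding. The list-level estimator draws on one of the six logged records, while the item-position estimator draws on three. That difference in usable data is the intuition behind the efficiency gain described above, and it holds only insofar as the click-model assumption does.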