IR · LG · ML · Jul 26, 2019

On the Value of Bandit Feedback for Offline Recommender System Evaluation

arXiv:1907.12384v1 · 11 citations
Originality: Incremental advance
AI Analysis

The paper addresses a challenge that matters to both researchers and practitioners: accurately predicting how a recommender system will perform online. The advance is incremental, because it builds on existing bandit feedback methods.

The paper tackles the problem that traditional offline evaluation of recommender systems using next-item prediction may not accurately reflect online performance, and shows through simulated experiments that using bandit feedback (data on shown recommendations and clicks) can improve offline evaluation reliability.

In academic literature, recommender systems are often evaluated on the task of next-item prediction. The procedure aims to give an answer to the question: "Given the natural sequence of user-item interactions up to time t, can we predict which item the user will interact with at time t+1?". Evaluation results obtained through said methodology are then used as a proxy to predict which system will perform better in an online setting. The online setting, however, poses a subtly different question: "Given the natural sequence of user-item interactions up to time t, can we get the user to interact with a recommended item at time t+1?". From a causal perspective, the system performs an intervention, and we want to measure its effect. Next-item prediction is often used as a fall-back objective when information about interventions and their effects (shown recommendations and whether they received a click) is unavailable. When this type of data is available, however, it can provide great value for reliably estimating online recommender system performance. Through a series of simulated experiments with the RecoGym environment, we show where traditional offline evaluation schemes fall short. Additionally, we show how so-called bandit feedback can be exploited for effective offline evaluation that more accurately reflects online performance.
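To make the idea of evaluating from bandit feedback concrete, here is a minimal sketch of one standard counterfactual estimator, inverse propensity scoring (IPS). The abstract does not say which estimator the paper uses, so this is an illustration of the general technique rather than the authors' method. The log format, the function names (`ips_estimate`, `simulate_logs`, `always_item_1`) and the toy click probabilities are all assumptions made for this example; it does not use RecoGym.

```python
import random


def ips_estimate(logs, target_prob, clip=None):
    """IPS estimate of the click-through rate a target policy would obtain.

    logs: iterable of (context, action, propensity, click) tuples, where
          propensity is the logging policy's probability of showing `action`.
    target_prob: function (context, action) -> probability under the target policy.
    clip: optional upper bound on importance weights (adds bias, reduces variance).
    """
    total, n = 0.0, 0
    for context, action, propensity, click in logs:
        # Reweight each logged click by how much more (or less) likely the
        # target policy is to show this item than the logging policy was.
        weight = target_prob(context, action) / propensity
        if clip is not None:
            weight = min(weight, clip)
        total += weight * click
        n += 1
    if n == 0:
        raise ValueError("no logged interactions")
    return total / n


# Hypothetical toy setup: two items, no context, uniform logging policy.
TRUE_CTR = {0: 0.1, 1: 0.3}  # assumed click probabilities, for simulation only


def simulate_logs(n, seed=0):
    """Simulate bandit feedback: which item was shown and whether it was clicked."""
    rng = random.Random(seed)
    logs = []
    for _ in range(n):
        action = rng.choice([0, 1])  # logging policy shows each item with prob. 0.5
        click = 1 if rng.random() < TRUE_CTR[action] else 0
        logs.append((None, action, 0.5, click))
    return logs


def always_item_1(context, action):
    """Target policy to evaluate offline: always recommend item 1."""
    return 1.0 if action == 1 else 0.0


logs = simulate_logs(20000)
estimate = ips_estimate(logs, always_item_1)
print(f"IPS estimate of the target policy's CTR: {estimate:.3f}")
```

In this toy setting the estimate should land close to the assumed click probability of item 1, because the logs record the intervention (which item was shown) and its effect (whether it was clicked). A next-item prediction metric computed on organic interaction sequences has no access to either quantity, which is the gap the abstract describes.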

