LG · AI · IR · Aug 12, 2024

Learned Ranking Function: From Short-term Behavior Predictions to Long-term User Satisfaction

arXiv:2408.06512v1 · 4 citations · h-index: 7
Originality: Incremental advance
AI Analysis

This work addresses the challenge of improving user retention and engagement on large-scale platforms such as YouTube by moving beyond heuristic-based ranking methods.

The paper tackles the problem of optimizing long-term user satisfaction in recommendation systems by introducing the Learned Ranking Function (LRF), which uses short-term behavior predictions to generate slates, and reports deployment on YouTube with live experiments.

We present the Learned Ranking Function (LRF), a system that takes short-term user-item behavior predictions as input and outputs a slate of recommendations that directly optimizes for long-term user satisfaction. Most previous work is based on optimizing the hyperparameters of a heuristic function. We propose to model the problem directly as a slate optimization problem with the objective of maximizing long-term user satisfaction. We also develop a novel constraint optimization algorithm that stabilizes objective trade-offs for multi-objective optimization. We evaluate our approach with live experiments and describe its deployment on YouTube.

