LG AI IR Aug 12, 2024

Learned Ranking Function: From Short-term Behavior Predictions to Long-term User Satisfaction

Yi Wu, Daryl Chang, Jennifer She, Zhe Zhao, Li Wei, Lukasz Heldt

arXiv:2408.06512v16.44 citations h-index: 7

Originality: Incremental advance

AI Analysis

This work addresses the challenge of improving user retention and engagement in large-scale platforms like YouTube by moving beyond heuristic-based methods.

The paper tackles the problem of optimizing long-term user satisfaction in recommendation systems by introducing the Learned Ranking Function (LRF), which uses short-term behavior predictions to generate slates, and reports deployment on YouTube with live experiments.

We present the Learned Ranking Function (LRF), a system that takes short-term user-item behavior predictions as input and outputs a slate of recommendations that directly optimizes for long-term user satisfaction. Most previous work is based on optimizing the hyperparameters of a heuristic function. We propose to model the problem directly as a slate optimization problem with the objective of maximizing long-term user satisfaction. We also develop a novel constraint optimization algorithm that stabilizes objective trade-offs for multi-objective optimization. We evaluate our approach with live experiments and describe its deployment on YouTube.
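For context, the heuristic baseline that the abstract contrasts against can be sketched as a weighted sum of short-term behavior predictions, with the weights acting as the tuned hyperparameters. This is only an illustrative sketch, not the LRF and not YouTube's ranking function: the behavior names (`click`, `watch`, `like`), the weight values, and the helper names `heuristic_score` and `rank_slate` are all assumptions made for the example.

```python
from typing import Dict, List

# Hypothetical weights over short-term behavior predictions.
# In the heuristic approach, values like these are the hyperparameters being tuned.
WEIGHTS: Dict[str, float] = {"click": 1.0, "watch": 2.0, "like": 0.5}


def heuristic_score(preds: Dict[str, float], weights: Dict[str, float] = WEIGHTS) -> float:
    """Combine per-behavior predictions into one score via a weighted sum."""
    return sum(weights[name] * preds[name] for name in weights)


def rank_slate(
    candidates: Dict[str, Dict[str, float]],
    slate_size: int,
    weights: Dict[str, float] = WEIGHTS,
) -> List[str]:
    """Return the top `slate_size` item ids ordered by descending heuristic score."""
    ranked = sorted(
        candidates,
        key=lambda item_id: heuristic_score(candidates[item_id], weights),
        reverse=True,
    )
    return ranked[:slate_size]


# Made-up predictions for three candidate items (illustrative values only).
candidates = {
    "a": {"click": 0.9, "watch": 0.1, "like": 0.0},
    "b": {"click": 0.5, "watch": 0.5, "like": 0.2},
    "c": {"click": 0.2, "watch": 0.3, "like": 0.9},
}

slate = rank_slate(candidates, slate_size=2)
print(slate)
```

A function of this shape scores each item independently and has no explicit notion of long-term satisfaction, which is the gap the abstract says the LRF targets by treating ranking as a slate optimization problem.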
