Scaling up Ranking under Constraints for Live Recommendations by Replacing Optimization with Prediction
This work addresses latency in personalized content recommendation and enables larger-scale real-time decision-making, though the contribution is incremental because it builds on existing constrained ranking frameworks.
The paper tackles real-time constrained ranking for live recommendations by replacing optimization with prediction at deployment time. The approach substantially reduces the required computing resources with little loss in constraint compliance and utility, while meeting a 50-millisecond latency requirement.
Many important multiple-objective decision problems can be cast within the framework of ranking under constraints and solved via a weighted bipartite matching linear program. Some of these optimization problems, such as personalized content recommendations, may need to be solved in real time and thus must comply with strict time requirements to prevent the perception of latency by consumers. Classical linear programming is too computationally inefficient for such settings. We propose a novel approach to scale up ranking under constraints by replacing the weighted bipartite matching optimization with a prediction problem in the algorithm deployment stage. We show empirically that the proposed approximate solution to the ranking problem leads to a major reduction in required computing resources without much sacrifice in constraint compliance and achieved utility. This allows us to solve larger constrained ranking problems in real time, within the required 50 milliseconds, than previously reported.
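To make the underlying problem concrete, here is a minimal sketch of ranking under constraints as an item-to-position matching on a toy instance. It is not the authors' method: it uses exhaustive search as a stand-in for the weighted bipartite matching linear program, and it does not show the prediction step. The utilities, position weights, group labels, the "at least one group item in the top two" constraint, and the function name rank_under_constraints are all assumptions made for illustration.

```python
from itertools import permutations


def rank_under_constraints(utility, position_weight, group, min_in_top, top_k):
    """Return the best item-to-position assignment for a tiny instance.

    Hypothetical helper: exhaustive search over all orderings, usable only
    for a handful of items. The objective is the sum over positions of
    utility[item] * position_weight[position]; the single constraint is that
    at least `min_in_top` items with group label 1 appear in the first
    `top_k` positions.
    """
    n = len(utility)
    best_order, best_value = None, float("-inf")
    for order in permutations(range(n)):
        # order[p] is the item placed at position p.
        if sum(group[i] for i in order[:top_k]) < min_in_top:
            continue  # exposure constraint violated, skip this ordering
        value = sum(utility[i] * position_weight[p] for p, i in enumerate(order))
        if value > best_value:
            best_order, best_value = order, value
    return best_order, best_value


# Toy data (assumed): four items, two of which belong to a protected group.
utility = [0.9, 0.8, 0.5, 0.4]
position_weight = [1.0, 0.5, 0.25, 0.1]
group = [0, 0, 1, 1]

unconstrained, _ = rank_under_constraints(utility, position_weight, group, 0, 2)
order, value = rank_under_constraints(utility, position_weight, group, 1, 2)
print(unconstrained)  # ranking by utility alone
print(order, value)   # ranking once the constraint is enforced
```

Exhaustive search grows factorially with the number of items, which makes the latency concern in the abstract easy to see even at small sizes. An LP solver scales far better than this sketch, but the abstract reports that classical linear programming is still too slow for the 50-millisecond budget.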