MLIRLGOCJun 30, 2022

Ranking In Generalized Linear Bandits

Oxford
arXiv:2207.00109v21 citationsh-index: 89
Originality Incremental advance
AI Analysis

This work addresses the challenge of optimizing ordered lists in recommendation systems, which is incremental as it generalizes existing studies on position dependencies and connects to graph theory.

The paper tackles the ranking problem in generalized linear bandits by modeling position and item dependencies in ordered lists, such as in recommendation systems, and designs UCB and Thompson Sampling algorithms to address this complexity.

We study the ranking problem in generalized linear bandits. At each time, the learning agent selects an ordered list of items and observes stochastic outcomes. In recommendation systems, displaying an ordered list of the most attractive items is not always optimal as both position and item dependencies result in a complex reward function. A very naive example is the lack of diversity when all the most attractive items are from the same category. We model the position and item dependencies in the ordered list and design UCB and Thompson Sampling type algorithms for this problem. Our work generalizes existing studies in several directions, including position dependencies where position discount is a particular case, and connecting the ranking problem to graph theory.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes