IR, LG | Sep 24, 2023

Design Principles of Robust Multi-Armed Bandit Framework in Video Recommendations

arXiv:2310.01419v1
h-index: 11
Originality: Incremental advance
AI Analysis

The paper addresses robustness challenges in video recommendations, but the contribution is incremental: it builds on existing bandit frameworks by adding new design principles.

The paper tackles the problem of making multi-armed bandit models in recommender systems robust to distributional changes and item cannibalization, reporting relative gains of up to 11.88% in ROC-AUC and 44.85% in PR-AUC over a baseline.

Current multi-armed bandit approaches in recommender systems (RS) have focused more on devising effective exploration techniques, while not adequately addressing common exploitation challenges related to distributional changes and item cannibalization. Little work exists to guide the design of robust bandit frameworks that can address these frequent challenges in RS. In this paper, we propose new design principles that make bandit models (i) robust to time-variant metadata signals, (ii) less prone to item cannibalization, and (iii) less susceptible to weight fluctuations caused by data sparsity. Through a series of experiments, we systematically examine the influence of several important bandit design choices. Through in-depth analyses, we demonstrate the advantage of our proposed design principles in making bandit models robust to dynamic behavioral changes. Notably, compared to a baseline bandit model that does not incorporate our design choices, we show relative gains of up to $11.88\%$ in ROC-AUC and $44.85\%$ in PR-AUC. Case studies on fairness in recommending specific popular and unpopular titles demonstrate the robustness of our proposed design in addressing popularity biases.
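
The abstract reports "relative gain" without defining it. A minimal sketch, assuming the common definition (the candidate metric minus the baseline metric, divided by the baseline metric), is shown below. The function name `relative_gain` and the metric values are hypothetical and chosen only for illustration; they are not taken from the paper.

```python
def relative_gain(candidate: float, baseline: float) -> float:
    """Relative improvement of `candidate` over `baseline`, in percent.

    Assumes the conventional definition (candidate - baseline) / baseline;
    the paper may define the quantity differently.
    """
    if baseline == 0:
        raise ValueError("baseline metric must be non-zero")
    return (candidate - baseline) / baseline * 100.0


# Hypothetical ROC-AUC values for illustration only (not results from the paper).
baseline_roc_auc = 0.50
candidate_roc_auc = 0.55

gain = relative_gain(candidate_roc_auc, baseline_roc_auc)
print(f"relative ROC-AUC gain: {gain:.2f}%")
```

Under this definition, a relative gain differs from an absolute one: moving from 0.50 to 0.55 is a 0.05 absolute gain but a 10% relative gain, which is why relative figures can look large when the baseline metric is small.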
