LG MLJun 1, 2020

Submodular Bandit Problem Under Multiple Constraints

Sho Takemori, Masahiro Sato, Takashi Sonoda, Janmajay Singh, Tomoko Ohkuma

arXiv:2006.00661v511.118 citationsh-index: 17

Originality Incremental advance

AI Analysis

This work addresses the problem of online learning with diversified recommendations under complex constraints for recommender systems, representing an incremental advancement in the field.

The paper tackles the submodular bandit problem under multiple constraints, such as knapsacks and k-system constraints, to improve diversified retrieval in recommender systems, and demonstrates that the proposed non-greedy algorithm outperforms existing baselines in experiments with synthetic and real-world datasets.

The linear submodular bandit problem was proposed to simultaneously address diversified retrieval and online learning in a recommender system. If there is no uncertainty, this problem is equivalent to a submodular maximization problem under a cardinality constraint. However, in some situations, recommendation lists should satisfy additional constraints such as budget constraints, other than a cardinality constraint. Thus, motivated by diversified retrieval considering budget constraints, we introduce a submodular bandit problem under the intersection of $l$ knapsacks and a $k$-system constraint. Here $k$-system constraints form a very general class of constraints including cardinality constraints and the intersection of $k$ matroid constraints. To solve this problem, we propose a non-greedy algorithm that adaptively focuses on a standard or modified upper-confidence bound. We provide a high-probability upper bound of an approximation regret, where the approximation ratio matches that of a fast offline algorithm. Moreover, we perform experiments under various combinations of constraints using a synthetic and two real-world datasets and demonstrate that our proposed methods outperform the existing baselines.

View on arXiv PDF

Similar