LG DS ML | Dec 14, 2024

Stochastic $k$-Submodular Bandits with Full Bandit Feedback

Guanyu Nie, Vaneet Aggarwal, Christopher John Quinn

arXiv:2412.10682v14.61 citations | h-index: 7 | AAMAS

Originality: Incremental advance

AI Analysis

This work addresses online combinatorial optimization problems of interest to researchers in machine learning and operations research. It is an incremental advance because it builds on existing offline-to-online frameworks.

The paper tackles the problem of online k-submodular optimization with full-bandit feedback by proposing algorithms for various constraints, achieving the first sublinear α-regret bounds.

In this paper, we present the first sublinear $α$-regret bounds for online $k$-submodular optimization problems with full-bandit feedback, where $α$ is a corresponding offline approximation ratio. Specifically, we propose online algorithms for multiple $k$-submodular stochastic combinatorial multi-armed bandit problems, including (i) monotone functions and individual size constraints, (ii) monotone functions with matroid constraints, (iii) non-monotone functions with matroid constraints, (iv) non-monotone functions without constraints, and (v) monotone functions without constraints. We transform approximation algorithms for offline $k$-submodular maximization problems into online algorithms through the offline-to-online framework proposed by Nie et al. (2023a). A key contribution of our work is analyzing the robustness of the offline algorithms.
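For readers unfamiliar with the term, the standard notion of $α$-regret used in the combinatorial bandit literature can be written as below. The notation is an assumption for illustration and is not taken from the paper: $T$ is the horizon, $f$ the expected reward function, $\mathbf{x}_t$ the action played in round $t$, and $\mathbf{x}^*$ an optimal feasible action.

$$
\mathbb{E}[\mathcal{R}_α(T)] \;=\; α \, T \, f(\mathbf{x}^*) \;-\; \mathbb{E}\left[\sum_{t=1}^{T} f(\mathbf{x}_t)\right]
$$

Under this definition, a bound is sublinear when $\mathbb{E}[\mathcal{R}_α(T)] = o(T)$, meaning the average per-round reward approaches at least an $α$ fraction of the optimal expected reward as $T$ grows.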
