LG | DS | ML | Dec 14, 2024

Stochastic $k$-Submodular Bandits with Full Bandit Feedback

arXiv:2412.10682v1 | 1 citations | h-index: 7 | AAMAS
Originality: Incremental advance
AI Analysis

This work addresses online combinatorial optimization problems of interest to researchers in machine learning and operations research. The advance is incremental, since it builds on an existing offline-to-online framework.

The paper tackles the problem of online k-submodular optimization with full-bandit feedback by proposing algorithms for various constraints, achieving the first sublinear α-regret bounds.
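For readers unfamiliar with the term, $α$-regret is usually defined in the combinatorial bandit literature as follows. The notation is an assumption made for illustration and may differ from the paper's: $f$ is the expected reward function, $\mathbf{x}^*$ an optimal feasible solution, $\mathbf{x}_t$ the action played in round $t$, and $T$ the horizon.

$$
\mathcal{R}_\alpha(T) \;=\; \alpha\, T\, f(\mathbf{x}^*) \;-\; \sum_{t=1}^{T} \mathbb{E}\big[f(\mathbf{x}_t)\big]
$$

A bound is sublinear when $\mathcal{R}_\alpha(T)/T \to 0$ as $T \to \infty$, meaning the learner's average reward approaches at least an $α$ fraction of the optimum.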

In this paper, we present the first sublinear $α$-regret bounds for online $k$-submodular optimization problems with full-bandit feedback, where $α$ is a corresponding offline approximation ratio. Specifically, we propose online algorithms for multiple $k$-submodular stochastic combinatorial multi-armed bandit problems, including (i) monotone functions and individual size constraints, (ii) monotone functions with matroid constraints, (iii) non-monotone functions with matroid constraints, (iv) non-monotone functions without constraints, and (v) monotone functions without constraints. We transform approximation algorithms for offline $k$-submodular maximization problems into online algorithms through the offline-to-online framework proposed by Nie et al. (2023a). A key contribution of our work is analyzing the robustness of the offline algorithms.
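The offline-to-online idea can be illustrated with a minimal explore-then-commit sketch in Python. It is not the paper's algorithm: it wraps a simple greedy for the unconstrained monotone case, replaces each exact function evaluation with the average of `m` noisy plays, and then commits to the resulting solution. The names `toy_play`, `noisy_mean`, and `etc_greedy`, the additive toy reward, and the choice of `m` are all hypothetical and chosen only for illustration.

```python
import random


def toy_play(weights, noise=0.05, rng=random):
    """Hypothetical environment: additive reward plus bounded noise.

    weights[e][t - 1] is the value of giving element e the type t.
    An action is a tuple with one entry per element:
    0 means unassigned, 1..k means the assigned type.
    """
    def play(action):
        value = sum(weights[e][t - 1] for e, t in enumerate(action) if t != 0)
        return value + rng.uniform(-noise, noise)
    return play


def noisy_mean(play, action, m):
    """Estimate the expected reward of one action from m full-bandit plays."""
    return sum(play(action) for _ in range(m)) / m


def etc_greedy(play, n, k, m, horizon):
    """Explore-then-commit around a greedy assignment (illustrative sketch).

    Exploration: visit elements in a fixed order and give each one the
    type whose estimated reward is largest, using m plays per candidate.
    Commitment: play the final assignment for all remaining rounds.
    Returns the committed action and the number of exploration rounds.
    """
    action = [0] * n
    used = 0
    for e in range(n):
        best_type, best_value = 1, float("-inf")
        for t in range(1, k + 1):
            candidate = action.copy()
            candidate[e] = t
            estimate = noisy_mean(play, tuple(candidate), m)
            used += m
            if estimate > best_value:
                best_type, best_value = t, estimate
        action[e] = best_type
    final = tuple(action)
    # Commit phase: no further learning, only exploitation.
    for _ in range(max(0, horizon - used)):
        play(final)
    return final, used
```

In this sketch exploration costs `n * k * m` rounds, so `m` trades estimation accuracy against the rounds left for commitment. The robustness analysis mentioned in the abstract concerns how much an offline algorithm's guarantee degrades when it is fed such noisy estimates.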

