LG MLOct 28, 2019

Online Continuous Submodular Maximization: From Full-Information to Bandit Feedback

Mingrui Zhang, Lin Chen, Hamed Hassani, Amin Karbasi

arXiv:1910.12424v114.068 citations

Originality Incremental advance

AI Analysis

This work addresses optimization problems in machine learning and AI, particularly for online decision-making with submodular functions, offering incremental improvements in efficiency and regret bounds.

The paper tackles online continuous submodular maximization by proposing algorithms that reduce gradient evaluations and achieve regret bounds, with Mono-Frank-Wolfe achieving O(T^{4/5}) regret and Bandit-Frank-Wolfe achieving O(T^{8/9}) regret in bandit settings.

In this paper, we propose three online algorithms for submodular maximisation. The first one, Mono-Frank-Wolfe, reduces the number of per-function gradient evaluations from $T^{1/2}$ [Chen2018Online] and $T^{3/2}$ [chen2018projection] to 1, and achieves a $(1-1/e)$-regret bound of $O(T^{4/5})$. The second one, Bandit-Frank-Wolfe, is the first bandit algorithm for continuous DR-submodular maximization, which achieves a $(1-1/e)$-regret bound of $O(T^{8/9})$. Finally, we extend Bandit-Frank-Wolfe to a bandit algorithm for discrete submodular maximization, Responsive-Frank-Wolfe, which attains a $(1-1/e)$-regret bound of $O(T^{8/9})$ in the responsive bandit setting.

View on arXiv PDF

Similar