LGMLOct 28, 2019

Online Continuous Submodular Maximization: From Full-Information to Bandit Feedback

arXiv:1910.12424v168 citations
Originality Incremental advance
AI Analysis

This work addresses optimization problems in machine learning and AI, particularly for online decision-making with submodular functions, offering incremental improvements in efficiency and regret bounds.

The paper tackles online continuous submodular maximization by proposing algorithms that reduce gradient evaluations and achieve regret bounds, with Mono-Frank-Wolfe achieving O(T^{4/5}) regret and Bandit-Frank-Wolfe achieving O(T^{8/9}) regret in bandit settings.

In this paper, we propose three online algorithms for submodular maximisation. The first one, Mono-Frank-Wolfe, reduces the number of per-function gradient evaluations from $T^{1/2}$ [Chen2018Online] and $T^{3/2}$ [chen2018projection] to 1, and achieves a $(1-1/e)$-regret bound of $O(T^{4/5})$. The second one, Bandit-Frank-Wolfe, is the first bandit algorithm for continuous DR-submodular maximization, which achieves a $(1-1/e)$-regret bound of $O(T^{8/9})$. Finally, we extend Bandit-Frank-Wolfe to a bandit algorithm for discrete submodular maximization, Responsive-Frank-Wolfe, which attains a $(1-1/e)$-regret bound of $O(T^{8/9})$ in the responsive bandit setting.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes