LGOCJun 1, 2021

Minimax Regret for Bandit Convex Optimisation of Ridge Functions

arXiv:2106.00444v26 citations
Originality Incremental advance
AI Analysis

This provides a theoretical bound for a specific bandit optimization problem, which is incremental as it builds on existing work by restricting the adversary to ridge functions.

The paper tackles the problem of adversarial bandit convex optimization with ridge functions, where the adversary plays functions of the form f_t(x) = g_t(⟨x, θ⟩) with unknown θ, and proves that the minimax regret is at most O(d √n log(n diam(K))).

We analyse adversarial bandit convex optimisation with an adversary that is restricted to playing functions of the form $f_t(x) = g_t(\langle x, θ\rangle)$ for convex $g_t : \mathbb R \to \mathbb R$ and unknown $θ\in \mathbb R^d$ that is homogeneous over time. We provide a short information-theoretic proof that the minimax regret is at most $O(d \sqrt{n} \log(n \operatorname{diam}(\mathcal K)))$ where $n$ is the number of interactions, $d$ the dimension and $\operatorname{diam}(\mathcal K)$ is the diameter of the constraint set.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes