LG OCJun 1, 2021

Minimax Regret for Bandit Convex Optimisation of Ridge Functions

arXiv:2106.00444v29.26 citations

Originality Incremental advance

AI Analysis

This provides a theoretical bound for a specific bandit optimization problem, which is incremental as it builds on existing work by restricting the adversary to ridge functions.

The paper tackles the problem of adversarial bandit convex optimization with ridge functions, where the adversary plays functions of the form f_t(x) = g_t(⟨x, θ⟩) with unknown θ, and proves that the minimax regret is at most O(d √n log(n diam(K))).

We analyse adversarial bandit convex optimisation with an adversary that is restricted to playing functions of the form $f_t(x) = g_t(\langle x, θ\rangle)$ for convex $g_t : \mathbb R \to \mathbb R$ and unknown $θ\in \mathbb R^d$ that is homogeneous over time. We provide a short information-theoretic proof that the minimax regret is at most $O(d \sqrt{n} \log(n \operatorname{diam}(\mathcal K)))$ where $n$ is the number of interactions, $d$ the dimension and $\operatorname{diam}(\mathcal K)$ is the diameter of the constraint set.

View on arXiv PDF

Similar