OCLGMLJun 10, 2024

Online Newton Method for Bandit Convex Optimisation

arXiv:2406.06506v19 citations
Originality Incremental advance
AI Analysis

This addresses the challenge of efficient optimization with limited feedback for researchers in online learning and optimization, representing an incremental improvement with specific computational gains.

The paper tackles the problem of bandit convex optimization by introducing a computationally efficient algorithm, achieving a regret bound of at most $d^{3.5} \sqrt{n} \mathrm{polylog}(n, d)$ in the adversarial setting and $M d^{2} \sqrt{n} \mathrm{polylog}(n, d)$ in the stochastic setting, where $d$ is dimension and $n$ is time horizon.

We introduce a computationally efficient algorithm for zeroth-order bandit convex optimisation and prove that in the adversarial setting its regret is at most $d^{3.5} \sqrt{n} \mathrm{polylog}(n, d)$ with high probability where $d$ is the dimension and $n$ is the time horizon. In the stochastic setting the bound improves to $M d^{2} \sqrt{n} \mathrm{polylog}(n, d)$ where $M \in [d^{-1/2}, d^{-1 / 4}]$ is a constant that depends on the geometry of the constraint set and the desired computational properties.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes