Tractable Instances of Bilinear Maximization: Implementing LinUCB on Ellipsoids
This work addresses a key computational bottleneck in linear bandits, allowing for efficient implementation of optimistic algorithms in high-dimensional settings, which is incremental as it builds on existing bandit theory but solves a specific tractability issue.
The paper tackles the problem of efficiently maximizing a bilinear function over convex sets and ellipsoids, which is fundamental for linear bandits, by showing that no efficient algorithms exist for some sets like ℓ_p balls with p>2 unless P=NP, and providing two novel algorithms that solve it efficiently when the set is a centered ellipsoid, enabling the first known method to implement optimistic algorithms for linear bandits in high dimensions.
We consider the maximization of $x^\top θ$ over $(x,θ) \in \mathcal{X} \times Θ$, with $\mathcal{X} \subset \mathbb{R}^d$ convex and $Θ\subset \mathbb{R}^d$ an ellipsoid. This problem is fundamental in linear bandits, as the learner must solve it at every time step using optimistic algorithms. We first show that for some sets $\mathcal{X}$ e.g. $\ell_p$ balls with $p>2$, no efficient algorithms exist unless $\mathcal{P} = \mathcal{NP}$. We then provide two novel algorithms solving this problem efficiently when $\mathcal{X}$ is a centered ellipsoid. Our findings provide the first known method to implement optimistic algorithms for linear bandits in high dimensions.