ML LG STNov 8, 2020

High-Dimensional Sparse Linear Bandits

arXiv:2011.04020v223.575 citations

Originality Incremental advance

AI Analysis

This work addresses optimization challenges in personalized medicine and online advertising by establishing optimal regret rates for sparse linear bandits, though it is incremental as it complements existing bounds for data-rich regimes.

The paper tackles the problem of stochastic linear bandits with high-dimensional sparse features, deriving a novel dimension-free minimax regret lower bound of Ω(n^{2/3}) in the data-poor regime and providing a nearly matching upper bound of Θ(n^{2/3}) for an explore-then-commit algorithm, with an additional O(√n) bound under certain signal assumptions.

Stochastic linear bandits with high-dimensional sparse features are a practical model for a variety of domains, including personalized medicine and online advertising. We derive a novel $Ω(n^{2/3})$ dimension-free minimax regret lower bound for sparse linear bandits in the data-poor regime where the horizon is smaller than the ambient dimension and where the feature vectors admit a well-conditioned exploration distribution. This is complemented by a nearly matching upper bound for an explore-then-commit algorithm showing that that $Θ(n^{2/3})$ is the optimal rate in the data-poor regime. The results complement existing bounds for the data-rich regime and provide another example where carefully balancing the trade-off between information and regret is necessary. Finally, we prove a dimension-free $O(\sqrt{n})$ regret upper bound under an additional assumption on the magnitude of the signal for relevant features.

View on arXiv PDF

Similar