A Simple Unified Framework for High Dimensional Bandit Problems
This provides a general solution for online decision-making in applications like advertising and drug discovery, though it appears incremental as it extends existing frameworks.
The authors proposed a unified algorithm for high-dimensional bandit problems with low-dimensional structures, achieving comparable regret bounds in LASSO bandit and novel bounds in low-rank matrix, group sparse matrix, and multi-agent LASSO bandit problems.
Stochastic high dimensional bandit problems with low dimensional structures are useful in different applications such as online advertising and drug discovery. In this work, we propose a simple unified algorithm for such problems and present a general analysis framework for the regret upper bound of our algorithm. We show that under some mild unified assumptions, our algorithm can be applied to different high dimensional bandit problems. Our framework utilizes the low dimensional structure to guide the parameter estimation in the problem, therefore our algorithm achieves the comparable regret bounds in the LASSO bandit, as well as novel bounds in the low-rank matrix bandit, the group sparse matrix bandit, and in a new problem: the multi-agent LASSO bandit.