Fast EXP3 Algorithms
This work addresses computational bottlenecks for researchers and practitioners in online learning, though it is incremental as it builds on the well-known EXP3 algorithm.
The paper tackles the computational inefficiency of the EXP3 algorithm in multi-armed bandit problems by proposing constant-time implementations per round, resulting in practical algorithms with analyzed trade-offs between regret bounds and time complexities.
We point out that EXP3 can be implemented in constant time per round, propose more practical algorithms, and analyze the trade-offs between the regret bounds and time complexities of these algorithms.