LGFeb 11, 2022

Efficient Kernel UCB for Contextual Bandits

Houssam Zenati, Alberto Bietti, Eustache Diemert, Julien Mairal, Matthieu Martin, Pierre Gaillard

arXiv:2202.05638v15.84 citationsHas Code

Originality Incremental advance

AI Analysis

This work addresses computational bottlenecks for researchers and practitioners in large-scale contextual bandit problems, representing an incremental improvement over existing methods.

The paper tackled the computational inefficiency of kernelized UCB algorithms in contextual bandits by proposing a method using incremental Nystrom approximations, achieving a complexity of O(CTm^2) compared to the standard O(CT^3), with m potentially as low as O(√T) or nearly constant to maintain the same regret.

In this paper, we tackle the computational efficiency of kernelized UCB algorithms in contextual bandits. While standard methods require a O(CT^3) complexity where T is the horizon and the constant C is related to optimizing the UCB rule, we propose an efficient contextual algorithm for large-scale problems. Specifically, our method relies on incremental Nystrom approximations of the joint kernel embedding of contexts and actions. This allows us to achieve a complexity of O(CTm^2) where m is the number of Nystrom points. To recover the same regret as the standard kernelized UCB algorithm, m needs to be of order of the effective dimension of the problem, which is at most O(\sqrt(T)) and nearly constant in some cases.

View on arXiv PDF Code

Similar