LG | AI | ML | Nov 5, 2021

Empirical analysis of representation learning and exploration in neural kernel bandits

arXiv:2111.03543v2
Originality: Incremental advance
AI Analysis

This work addresses the computational cost and performance of neural bandits for sequential decision tasks. It is an incremental advance with practical relevance in applied settings.

The authors guide common bandit policies with neural kernel Gaussian processes, which provide accurate uncertainty estimates and train faster than most Bayesian NNs, and report state-of-the-art performance on nonlinear structured data. They also introduce a framework that measures a bandit algorithm's representation learning and exploration abilities separately.

Neural bandits have been shown to provide an efficient solution to practical sequential decision tasks that have nonlinear reward functions. The main contributor to that success is approximate Bayesian inference, which enables neural network (NN) training with uncertainty estimates. However, Bayesian NNs often suffer from a prohibitive computational overhead or operate on a subset of parameters. Alternatively, certain classes of infinite neural networks were shown to directly correspond to Gaussian processes (GP) with neural kernels (NK). NK-GPs provide accurate uncertainty estimates and can be trained faster than most Bayesian NNs. We propose to guide common bandit policies with NK distributions and show that NK bandits achieve state-of-the-art performance on nonlinear structured data. Moreover, we propose a framework for measuring independently the ability of a bandit algorithm to learn representations and explore, and use it to analyze the impact of NK distributions w.r.t. those two aspects. We consider policies based on a GP and a Student's t-process (TP). Furthermore, we study practical considerations, such as training frequency and model partitioning. We believe our work will help better understand the impact of utilizing NKs in applied settings.
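To make the idea of guiding a bandit policy with a kernel GP concrete, here is a minimal sketch and not the authors' implementation. It assumes a degree-1 arc-cosine kernel (the GP kernel of an infinitely wide one-hidden-layer ReLU network, up to scaling) as a stand-in for the neural kernels studied in the paper, and it assumes an upper-confidence-bound (UCB) policy over a fixed set of arm feature vectors. All function names, the toy reward function, and the values of `beta` and `noise` are hypothetical choices made for illustration.

```python
import numpy as np


def arccos_kernel(A, B):
    """Degree-1 arc-cosine kernel between the rows of A and the rows of B.

    Assumption: used here as a simple stand-in for a neural kernel.
    """
    na = np.linalg.norm(A, axis=1)[:, None]
    nb = np.linalg.norm(B, axis=1)[None, :]
    cos = np.clip(A @ B.T / (na * nb + 1e-12), -1.0, 1.0)
    theta = np.arccos(cos)
    return na * nb * (np.sin(theta) + (np.pi - theta) * cos) / np.pi


def gp_posterior(X, y, Xq, noise=0.1):
    """GP posterior mean and standard deviation at query points Xq."""
    K = arccos_kernel(X, X) + noise * np.eye(len(X))
    Kq = arccos_kernel(Xq, X)
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    mean = Kq @ alpha
    v = np.linalg.solve(L, Kq.T)
    # For this kernel, k(x, x) equals the squared norm of x.
    prior_var = np.sum(Xq**2, axis=1)
    var = prior_var - np.sum(v**2, axis=0)
    return mean, np.sqrt(np.maximum(var, 0.0))


def ucb_choose(X_hist, y_hist, arms, beta=2.0, noise=0.1):
    """Pick the arm maximizing posterior mean + beta * posterior std."""
    if len(y_hist) == 0:
        return 0  # no data yet: arbitrary first pull
    mean, std = gp_posterior(np.array(X_hist), np.array(y_hist), arms, noise)
    return int(np.argmax(mean + beta * std))


def run_bandit(T=50, n_arms=5, dim=3, seed=0):
    """Simulate T rounds on a toy problem with a nonlinear reward."""
    rng = np.random.default_rng(seed)
    arms = rng.normal(size=(n_arms, dim))
    true_reward = np.abs(arms @ np.ones(dim))  # hypothetical reward function
    X_hist, y_hist, picks = [], [], []
    for _ in range(T):
        a = ucb_choose(X_hist, y_hist, arms)
        r = true_reward[a] + 0.1 * rng.normal()
        X_hist.append(arms[a])
        y_hist.append(r)
        picks.append(a)
    return picks, true_reward


if __name__ == "__main__":
    picks, true_reward = run_bandit()
    print("last 10 pulls:", picks[-10:])
    print("best arm:", int(np.argmax(true_reward)))
```

The posterior standard deviation drives exploration here: arms whose features are poorly covered by past observations keep a large bonus until they are pulled. A Thompson-sampling variant would instead draw a sample from the posterior and pick its argmax.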

Code Implementations: 2 repos