LGMLApr 21, 2022

Provably Efficient Kernelized Q-Learning

arXiv:2204.10349v14 citationsh-index: 14
Originality Incremental advance
AI Analysis

This work addresses the challenge of efficient reinforcement learning in complex environments, offering a provably efficient method with practical gains, though it is incremental as it builds on kernel methods and Q-learning.

The authors tackled the problem of Q-learning in high-dimensional spaces by proposing a kernelized version, deriving regret bounds for arbitrary kernels and showing that with a Gaussian RBF kernel, it achieves good performance in control tasks after only 1000 steps, outperforming deep Q-learning.

We propose and analyze a kernelized version of Q-learning. Although a kernel space is typically infinite-dimensional, extensive study has shown that generalization is only affected by the effective dimension of the data. We incorporate such ideas into the Q-learning framework and derive regret bounds for arbitrary kernels. In particular, we provide concrete bounds for linear kernels and Gaussian RBF kernels; notably, the latter bound looks almost identical to the former, only that the actual dimension is replaced by a different notion of dimensionality. Finally, we test our algorithm on a suite of classic control tasks; remarkably, under the Gaussian RBF kernel, it achieves reasonably good performance after only 1000 environmental steps, while its neural network counterpart, deep Q-learning, still struggles.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes