ML · AI · LG · Mar 22, 2017

Deep Exploration via Randomized Value Functions

arXiv:1703.07608v5 · 336 citations
Originality: Incremental advance
AI Analysis

This work addresses the exploration problem in reinforcement learning. It offers researchers and practitioners a method that combines statistical efficiency with practical value function learning, though the contribution appears incremental.

The paper tackles efficient exploration in reinforcement learning by using randomized value functions. It demonstrates the approach's efficacy through computational studies and proves a regret bound that establishes statistical efficiency for tabular representations.

We study the use of randomized value functions to guide deep exploration in reinforcement learning. This offers an elegant means for synthesizing statistically and computationally efficient exploration with common practical approaches to value function learning. We present several reinforcement learning algorithms that leverage randomized value functions and demonstrate their efficacy through computational studies. We also prove a regret bound that establishes statistical efficiency with a tabular representation.
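
As a rough illustration of the idea in the abstract, the sketch below samples a randomized value function at the start of each episode and then acts greedily with respect to it. It is a simplified tabular example and not one of the paper's algorithms. The chain environment, the function names (`chain_step`, `sample_randomized_q`, `run`), the Gaussian reward perturbation, and the `noise_scale / sqrt(n + 1)` schedule are all assumptions made for illustration.

```python
import random


def chain_step(s, a, n_states):
    """Hypothetical deterministic chain: action 1 moves right, action 0 moves left.
    The only reward is for choosing 'right' in the last state."""
    reward = 1.0 if (a == 1 and s == n_states - 1) else 0.0
    s_next = min(s + 1, n_states - 1) if a == 1 else max(s - 1, 0)
    return s_next, reward


def sample_randomized_q(counts, reward_sums, next_counts,
                        n_states, n_actions, horizon, noise_scale, rng):
    """Backward induction on the empirical model with Gaussian-perturbed rewards.
    Returns q[h][s][a]; q[horizon] holds the all-zero terminal values."""
    q = [[[0.0] * n_actions for _ in range(n_states)] for _ in range(horizon + 1)]
    for h in range(horizon - 1, -1, -1):
        for s in range(n_states):
            for a in range(n_actions):
                n = counts[s][a]
                mean_r = reward_sums[s][a] / n if n > 0 else 0.0
                # Assumed schedule: the perturbation shrinks as visits accumulate.
                noisy_r = mean_r + rng.gauss(0.0, noise_scale / (n + 1) ** 0.5)
                if n > 0:
                    # Expected next-step value under the empirical transitions.
                    exp_next = sum(next_counts[s][a][s2] / n * max(q[h + 1][s2])
                                   for s2 in range(n_states))
                else:
                    exp_next = 0.0
                q[h][s][a] = noisy_r + exp_next
    return q


def run(n_states=6, n_episodes=200, noise_scale=1.0, seed=0):
    """Each episode: sample one randomized Q, follow it greedily, update statistics."""
    rng = random.Random(seed)
    n_actions, horizon = 2, n_states
    counts = [[0] * n_actions for _ in range(n_states)]
    reward_sums = [[0.0] * n_actions for _ in range(n_states)]
    next_counts = [[[0] * n_states for _ in range(n_actions)]
                   for _ in range(n_states)]
    returns = []
    for _ in range(n_episodes):
        q = sample_randomized_q(counts, reward_sums, next_counts,
                                n_states, n_actions, horizon, noise_scale, rng)
        s, total = 0, 0.0
        for h in range(horizon):
            # Greedy with respect to the sampled Q; there is no per-step dithering.
            a = max(range(n_actions), key=lambda act: q[h][s][act])
            s_next, r = chain_step(s, a, n_states)
            counts[s][a] += 1
            reward_sums[s][a] += r
            next_counts[s][a][s_next] += 1
            s, total = s_next, total + r
        returns.append(total)
    return returns


if __name__ == "__main__":
    episode_returns = run()
    print("episodes that reached the reward:", int(sum(episode_returns)))
```

The design choice worth noting is that randomness enters once per episode through the sampled value function, so the agent commits to a consistent multi-step plan rather than jittering its action at every step. Setting `noise_scale=0.0` reduces the sampler to plain value iteration on the empirical model.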

