ML AI LG · Mar 22, 2017

Deep Exploration via Randomized Value Functions

Ian Osband, Benjamin Van Roy, Daniel Russo, Zheng Wen

arXiv:1703.07608v5 · 32.0336 citations

Originality: Incremental advance

AI Analysis

This work addresses exploration in reinforcement learning, offering researchers and practitioners a method that combines statistical efficiency with practical value function learning, though the advance appears incremental.

The paper uses randomized value functions to tackle efficient exploration in reinforcement learning. It demonstrates the approach's efficacy through computational studies and proves a regret bound for tabular representations.

We study the use of randomized value functions to guide deep exploration in reinforcement learning. This offers an elegant means for synthesizing statistically and computationally efficient exploration with common practical approaches to value function learning. We present several reinforcement learning algorithms that leverage randomized value functions and demonstrate their efficacy through computational studies. We also prove a regret bound that establishes statistical efficiency with a tabular representation.
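As a rough illustration of what "randomized value functions" can mean in the tabular case, the sketch below samples a Q-table by backward induction, adding Gaussian noise whose scale shrinks with visit counts. It is a minimal sketch, not the authors' algorithm: the function names `sample_q` and `greedy_action`, the transition tuple layout `(h, s, a, r, s_next)`, and the parameters `sigma` and `lam` are all assumptions made for this example.

```python
import numpy as np

def sample_q(transitions, n_states, n_actions, horizon,
             sigma=1.0, lam=1.0, rng=None):
    """Sample one randomized Q-table for a finite-horizon tabular problem.

    transitions: iterable of (h, s, a, r, s_next) tuples (assumed layout).
    sigma: assumed noise scale; lam: assumed prior precision.
    """
    rng = np.random.default_rng() if rng is None else rng
    # q[horizon] stays at zero and acts as the terminal value.
    q = np.zeros((horizon + 1, n_states, n_actions))
    for h in range(horizon - 1, -1, -1):
        count = np.zeros((n_states, n_actions))
        target_sum = np.zeros((n_states, n_actions))
        for (t, s, a, r, s_next) in transitions:
            if t == h:
                count[s, a] += 1
                # Regression target: reward plus sampled next-step value.
                target_sum[s, a] += r + q[h + 1, s_next].max()
        # Gaussian posterior for one-hot features with a N(0, 1/lam) prior.
        precision = count / sigma**2 + lam
        mean = (target_sum / sigma**2) / precision
        noise = rng.standard_normal((n_states, n_actions))
        q[h] = mean + noise / np.sqrt(precision)
    return q[:horizon]

def greedy_action(q, h, s):
    """Act greedily with respect to the sampled Q-table."""
    return int(np.argmax(q[h, s]))
```

In this sketch an agent would call `sample_q` once at the start of each episode and then follow `greedy_action` for the whole episode, so rarely visited state-action pairs keep wide noise and still get tried occasionally.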
