LG AI ML Feb 28, 2018

Model-Based Value Estimation for Efficient Model-Free Reinforcement Learning

Vladimir Feinberg, Alvin Wan, Ion Stoica, Michael I. Jordan, Joseph E. Gonzalez, Sergey Levine

arXiv:1803.00101v1 34.0 350 citations h-index: 163

Originality: Incremental advance

AI Analysis

This work addresses sample efficiency in reinforcement learning for continuous control tasks, representing an incremental improvement over existing methods.

The paper tackles the problem of high sample complexity in model-free reinforcement learning by incorporating learned dynamics models more effectively, resulting in improved value estimation and reduced sample requirements.

Recent model-free reinforcement learning algorithms have proposed incorporating learned dynamics models as a source of additional data with the intention of reducing sample complexity. Such methods hold the promise of incorporating imagined data coupled with a notion of model uncertainty to accelerate the learning of continuous control tasks. Unfortunately, they rely on heuristics that limit usage of the dynamics model. We present model-based value expansion, which controls for uncertainty in the model by only allowing imagination to fixed depth. By enabling wider use of learned dynamics models within a model-free reinforcement learning algorithm, we improve value estimation, which, in turn, reduces the sample complexity of learning.
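The fixed-depth imagination described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `mve_target` and the callables `dynamics`, `reward`, `policy`, and `q_value` are hypothetical placeholders for a learned dynamics model, a reward function, the current policy, and a learned critic. The sketch assumes a deterministic model and policy, rolls the model forward for `horizon` imagined steps, accumulates discounted rewards, and then bootstraps from the critic at the final imagined state.

```python
def mve_target(state, dynamics, reward, policy, q_value, horizon, gamma=0.99):
    """Fixed-depth value-expansion target (illustrative sketch only).

    All callables are assumed placeholders, not names from the paper:
      dynamics(s, a) -> next state predicted by a learned model
      reward(s, a)   -> scalar reward
      policy(s)      -> action chosen by the current policy
      q_value(s, a)  -> learned critic estimate
    """
    total = 0.0      # discounted sum of imagined rewards
    discount = 1.0   # gamma ** t at step t
    s = state
    for _ in range(horizon):
        a = policy(s)
        total += discount * reward(s, a)
        s = dynamics(s, a)   # imagined transition from the learned model
        discount *= gamma
    # Beyond the fixed depth, the model is no longer trusted:
    # fall back on the critic's estimate at the last imagined state.
    return total + discount * q_value(s, policy(s))
```

With `horizon=0` the function reduces to the critic's own estimate, so the depth parameter controls how much the target depends on the learned model. For instance, with a constant reward of 1.0, a critic that always returns 10.0, `horizon=3`, and `gamma=0.5`, the target is 1 + 0.5 + 0.25 + 0.125 × 10 = 3.0.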
