LG ML | Oct 21, 2019

Momentum in Reinforcement Learning

Nino Vieillard, Bruno Scherrer, Olivier Pietquin, Matthieu Geist

arXiv:1910.09322v2 | 13.436 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses the challenge of stabilizing and accelerating convergence in reinforcement learning for practitioners, but it appears incremental as it builds on existing methods like Value Iteration and DQN.

The authors tackled the problem of improving reinforcement learning algorithms by adapting the optimization concept of momentum to value iteration, resulting in Momentum Value Iteration (MoVI) and a DQN-based improvement that was tested on Atari games, though no concrete performance numbers were provided.

We adapt the concept of momentum from optimization to reinforcement learning. Viewing state-action value functions as analogous to gradients in optimization, we interpret momentum as an average of consecutive $q$-functions. We derive Momentum Value Iteration (MoVI), a variation of Value Iteration that incorporates this momentum idea. Our analysis shows that this allows MoVI to average errors over successive iterations. We show that the proposed approach can be readily extended to deep learning. Specifically, we propose a simple improvement on DQN based on MoVI and evaluate it on Atari games.
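The abstract does not spell out the update rule, so the following is only a minimal tabular sketch of the idea under stated assumptions. It assumes the policy is chosen greedily with respect to a running average `h` of the successive $q$-functions, that each new $q$-function comes from one Bellman backup under that policy, and that the average is uniform. The function name `momentum_value_iteration`, the arrays `P` and `R`, and the averaging weights are illustrative choices, not taken from the paper.

```python
import numpy as np

def momentum_value_iteration(P, R, gamma=0.9, n_iters=200):
    """Hypothetical tabular sketch of a momentum-style Value Iteration.

    P: transition probabilities, shape (S, A, S).
    R: rewards, shape (S, A).
    Returns the averaged q-function h, the last q-function, and the
    greedy policy with respect to h.
    """
    n_states, n_actions = R.shape
    q = np.zeros((n_states, n_actions))
    h = np.zeros((n_states, n_actions))  # running average of q-functions
    states = np.arange(n_states)
    for k in range(n_iters):
        # Assumption: act greedily w.r.t. the average h, not the latest q.
        pi = h.argmax(axis=1)
        # One Bellman backup of q under the policy pi.
        q = R + gamma * (P @ q[states, pi])
        # Uniform incremental mean of the q-functions computed so far.
        h += (q - h) / (k + 1)
    return h, q, h.argmax(axis=1)
```

Because the policy is greedy with respect to `h` rather than the latest `q`, a perturbation in a single iterate shifts `h` by only a `1 / (k + 1)` fraction in this sketch, which is one way to read the abstract's claim that errors are averaged over successive iterations.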
