LG · ML · Oct 21, 2019

Momentum in Reinforcement Learning

arXiv:1910.09322v2 · 36 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses the challenge of stabilizing and accelerating convergence in reinforcement learning, which matters to practitioners. It appears incremental, however, since it builds on existing methods such as Value Iteration and DQN.

The authors adapt the optimization concept of momentum to value iteration, which yields Momentum Value Iteration (MoVI). They also propose a DQN-based improvement and test it on Atari games, though no concrete performance numbers are provided.

We adapt the concept of momentum from optimization to reinforcement learning. Viewing state-action value functions as analogous to gradients in optimization, we interpret momentum as an average of consecutive $q$-functions. We derive Momentum Value Iteration (MoVI), a variation of Value Iteration that incorporates this momentum idea. Our analysis shows that this allows MoVI to average errors over successive iterations. We show that the proposed approach can be readily extended to deep learning. Specifically, we propose a simple improvement on DQN based on MoVI, and evaluate it on Atari games.
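
The idea of treating momentum as an average of consecutive $q$-functions can be illustrated on a small tabular MDP. The following is a minimal sketch and not the authors' implementation. It assumes that the policy is greedy with respect to an averaged function `h`, that `q` is updated with one Bellman backup under that policy, and that `h` is a uniform running average of the `q` iterates. The function name `movi`, the random MDP, and all parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def movi(P, r, gamma=0.9, n_iters=50):
    """Tabular sketch of a momentum-style value iteration (assumed form).

    P: transition probabilities, shape (S, A, S)
    r: rewards, shape (S, A)
    Returns the averaged function h and the list of q iterates.
    """
    n_states, n_actions = r.shape
    q = np.zeros((n_states, n_actions))
    h = q.copy()                      # running average of q iterates
    history = [q.copy()]
    for k in range(n_iters):
        # Act greedily with respect to the averaged function h, not q.
        pi = h.argmax(axis=1)
        # Value of the greedy action under the current q, shape (S,).
        v = q[np.arange(n_states), pi]
        # One Bellman backup under pi: (S, A, S) @ (S,) -> (S, A).
        q = r + gamma * (P @ v)
        # Assumed momentum weights: make h the uniform mean of q_0..q_{k+1}.
        beta = (k + 1) / (k + 2)
        h = beta * h + (1.0 - beta) * q
        history.append(q.copy())
    return h, history


# Hypothetical random MDP, used only to exercise the sketch.
rng = np.random.default_rng(0)
S, A = 5, 3
P = rng.random((S, A, S))
P /= P.sum(axis=2, keepdims=True)     # normalize rows into distributions
r = rng.random((S, A))                # rewards in [0, 1)

h, history = movi(P, r, gamma=0.9, n_iters=50)
greedy_policy = h.argmax(axis=1)
```

With these weights, `h` after `K` iterations is exactly the mean of `q_0, ..., q_K`, so an error in any single `q` iterate would be diluted by the average rather than driving the greedy policy on its own. This is the intuition behind the abstract's claim about averaging errors over successive iterations; other weighting schemes are possible.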

