LG, AI, ML · Aug 25, 2020

Deep Q-Learning: Theoretical Insights from an Asymptotic Analysis

arXiv:2008.10870v2 · 2 citations
Originality: Incremental advance
AI Analysis

This work addresses a theoretical gap for reinforcement learning practitioners, offering incremental insights into algorithm behavior.

The paper tackles the lack of theoretical guarantees for Deep Q-Learning by providing a convergence analysis under realistic and verifiable assumptions. It proves an asymptotic convergence result that helps explain performance inconsistencies observed even after training, and that accommodates state Markov processes with multiple stationary distributions.

Deep Q-Learning is an important reinforcement learning algorithm, which involves training a deep neural network, called Deep Q-Network (DQN), to approximate the well-known Q-function. Although wildly successful under laboratory conditions, serious gaps between theory and practice as well as a lack of formal guarantees prevent its use in the real world. Adopting a dynamical systems perspective, we provide a theoretical analysis of a popular version of Deep Q-Learning under realistic and verifiable assumptions. More specifically, we prove an important result on the convergence of the algorithm, characterizing the asymptotic behavior of the learning process. Our result sheds light on hitherto unexplained properties of the algorithm and helps understand empirical observations, such as performance inconsistencies even after training. Unlike previous theories, our analysis accommodates state Markov processes with multiple stationary distributions. In spite of the focus on Deep Q-Learning, we believe that our theory may be applied to understand other deep learning algorithms.
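
For orientation, the kind of update that Deep Q-Learning builds on can be sketched as below. This is a minimal illustration under stated assumptions, not the specific variant analyzed in the paper: the deep network is replaced by a plain lookup table so the example has no dependencies, a separate frozen copy (`q_target`) plays the role of a target network, and the names `td_target`, `q_update`, `gamma`, and `lr` are hypothetical.

```python
from typing import List, Sequence, Tuple

# (state, action, reward, next_state, done); a hypothetical transition layout.
Transition = Tuple[int, int, float, int, bool]


def td_target(q_target: Sequence[Sequence[float]], reward: float,
              next_state: int, done: bool, gamma: float) -> float:
    """Bootstrapped target r + gamma * max_a' Q_target(s', a').

    At the end of an episode there is no next state to bootstrap from,
    so the target is just the reward.
    """
    if done:
        return reward
    return reward + gamma * max(q_target[next_state])


def q_update(q: List[List[float]], q_target: Sequence[Sequence[float]],
             batch: Sequence[Transition], gamma: float = 0.99,
             lr: float = 0.1) -> float:
    """Apply one semi-gradient step per transition on the squared TD error.

    A table stands in for the network here (assumption for illustration);
    with a real DQN the same error would be backpropagated through the
    network weights. Returns the mean loss measured before each step.
    """
    total_loss = 0.0
    for state, action, reward, next_state, done in batch:
        error = td_target(q_target, reward, next_state, done, gamma) - q[state][action]
        total_loss += 0.5 * error * error
        q[state][action] += lr * error  # move the estimate toward the target
    return total_loss / len(batch)
```

Keeping `q_target` fixed while `q` changes mirrors the common practice of updating the target network only occasionally; how often it is refreshed is a design choice this sketch leaves out.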

