LG | ML | Jul 9, 2020

The Mean-Squared Error of Double Q-Learning

arXiv:2007.05034v3 | 19 citations
Originality: Synthesis-oriented
AI Analysis

This paper provides incremental theoretical insight for reinforcement learning researchers by clarifying when Double Q-learning and Q-learning have the same asymptotic mean-squared error.

The paper theoretically compares the asymptotic mean-squared error of Double Q-learning and Q-learning. It shows the two are equal when Double Q-learning uses twice the learning rate of Q-learning and outputs the average of its two estimators, and it uses simulations to illustrate the practical implications.

In this paper, we establish a theoretical comparison between the asymptotic mean-squared error of Double Q-learning and Q-learning. Our result builds upon an analysis for linear stochastic approximation based on Lyapunov equations and applies both to the tabular setting and to linear function approximation, provided that the optimal policy is unique and the algorithms converge. We show that the asymptotic mean-squared error of Double Q-learning is exactly equal to that of Q-learning if Double Q-learning uses twice the learning rate of Q-learning and outputs the average of its two estimators. We also present some practical implications of this theoretical observation using simulations.
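The setup the abstract describes can be sketched in the tabular case as follows. This is a minimal illustration, not the authors' code: the random MDP, the uniform behavior policy, the step-size schedule `g * c / (c + n)`, and all function names (`make_mdp`, `step_size`, `q_learning`, `double_q_learning`, `optimal_q`) are assumptions introduced here. The only elements taken from the abstract are that Double Q-learning runs with twice the learning rate of Q-learning and returns the average of its two estimators.

```python
import numpy as np

GAMMA = 0.5  # discount factor (assumed value for this sketch)


def make_mdp(n_states=3, n_actions=2, seed=0):
    """Hypothetical random MDP: P[s, a] is a distribution over next states."""
    rng = np.random.default_rng(seed)
    P = rng.dirichlet(np.ones(n_states), size=(n_states, n_actions))
    R = rng.uniform(0.0, 1.0, size=(n_states, n_actions))
    return P, R


def step_size(g, n, c=100.0):
    """Assumed decaying schedule; stays below 1 for g <= 1 and n >= 1."""
    return g * c / (c + n)


def q_learning(P, R, g, n_steps, seed):
    rng = np.random.default_rng(seed)
    n_states, n_actions = R.shape
    Q = np.zeros((n_states, n_actions))
    s = 0
    for n in range(1, n_steps + 1):
        a = rng.integers(n_actions)            # uniform behavior policy
        s2 = rng.choice(n_states, p=P[s, a])   # sample next state
        target = R[s, a] + GAMMA * Q[s2].max()
        Q[s, a] += step_size(g, n) * (target - Q[s, a])
        s = s2
    return Q


def double_q_learning(P, R, g, n_steps, seed):
    rng = np.random.default_rng(seed)
    n_states, n_actions = R.shape
    QA = np.zeros((n_states, n_actions))
    QB = np.zeros((n_states, n_actions))
    s = 0
    for n in range(1, n_steps + 1):
        a = rng.integers(n_actions)
        s2 = rng.choice(n_states, p=P[s, a])
        alpha = step_size(2.0 * g, n)          # twice the Q-learning rate
        if rng.random() < 0.5:
            # Update A: A picks the action, B evaluates it.
            a_star = QA[s2].argmax()
            QA[s, a] += alpha * (R[s, a] + GAMMA * QB[s2, a_star] - QA[s, a])
        else:
            # Update B: B picks the action, A evaluates it.
            b_star = QB[s2].argmax()
            QB[s, a] += alpha * (R[s, a] + GAMMA * QA[s2, b_star] - QB[s, a])
        s = s2
    return 0.5 * (QA + QB)                     # output the averaged estimator


def optimal_q(P, R, n_iters=500):
    """Reference Q* computed by value iteration on the known model."""
    Q = np.zeros_like(R)
    for _ in range(n_iters):
        Q = R + GAMMA * (P @ Q.max(axis=1))
    return Q


P, R = make_mdp()
q_star = optimal_q(P, R)
q_single = q_learning(P, R, g=0.5, n_steps=20000, seed=1)
q_double = double_q_learning(P, R, g=0.5, n_steps=20000, seed=1)
print("Q-learning squared error:       ", np.sum((q_single - q_star) ** 2))
print("Double Q-learning squared error:", np.sum((q_double - q_star) ** 2))
```

A single run gives only one noisy sample of the squared error. To compare mean-squared errors in the sense of the abstract, one would average the squared error over many independent seeds and look at long horizons, since the stated equality is asymptotic.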

Code Implementations: 1 repo
