OC · LG · Feb 1, 2020

Finite-Time Analysis of Asynchronous Stochastic Approximation and $Q$-Learning

arXiv:2002.00260v1 · 137 citations
Originality: Incremental advance
AI Analysis

The paper provides finite-time theoretical guarantees for asynchronous reinforcement learning algorithms; the result is an incremental improvement over prior work.

The paper analyzes the finite-time convergence of asynchronous stochastic approximation schemes and applies the analysis to asynchronous Q-learning. It proves a bound that matches the best known synchronous results and improves on previous asynchronous bounds.
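
For orientation, a generic asynchronous SA iteration of the kind described here can be written as below. This is a sketch in assumed notation, none of which appears in the summary text: $x(t)$ is the iterate, $F$ the operator, $i_t$ the single coordinate updated at time $t$, $\alpha_t$ the step size, $w(t)$ a noise term, $v$ a positive weight vector, and $\gamma \in (0,1)$ the contraction factor.

$$
x_i(t+1) =
\begin{cases}
x_i(t) + \alpha_t \bigl( F_i(x(t)) - x_i(t) + w(t) \bigr), & i = i_t, \\
x_i(t), & i \neq i_t,
\end{cases}
\qquad
\| F(x) - F(y) \|_v \le \gamma \, \| x - y \|_v,
\quad
\| x \|_v = \max_i \frac{|x_i|}{v_i}.
$$

"Asynchronous" refers to the fact that only one coordinate changes per step, and "weighted infinity-norm contractive" refers to the inequality on $F$.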

We consider a general asynchronous Stochastic Approximation (SA) scheme featuring a weighted infinity-norm contractive operator, and prove a bound on its finite-time convergence rate on a single trajectory. Additionally, we specialize the result to asynchronous $Q$-learning. The resulting bound matches the sharpest available bound for synchronous $Q$-learning, and improves over previous known bounds for asynchronous $Q$-learning.
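
To make the asynchronous $Q$-learning setting concrete, the following is a minimal Python sketch, not the paper's code. It runs tabular $Q$-learning along a single trajectory, updating only the visited state-action pair at each step, and compares the result with a reference $Q^*$ computed by value iteration. The random MDP, the uniform behaviour policy, the step size of the form `h / (t + t0)`, and all constants and function names are assumptions chosen for illustration.

```python
import random


def make_mdp(n_states, n_actions, rng):
    """Random tabular MDP (hypothetical test problem).

    P[s][a] is a distribution over next states, R[s][a] a reward in [0, 1].
    """
    P, R = [], []
    for _ in range(n_states):
        P_s, R_s = [], []
        for _ in range(n_actions):
            # The +0.1 offset keeps every transition possible, so the
            # single trajectory keeps visiting all state-action pairs.
            w = [rng.random() + 0.1 for _ in range(n_states)]
            total = sum(w)
            P_s.append([x / total for x in w])
            R_s.append(rng.random())
        P.append(P_s)
        R.append(R_s)
    return P, R


def optimal_q(P, R, gamma, iters=500):
    """Reference Q* via synchronous value iteration (uses the model)."""
    n_s, n_a = len(P), len(P[0])
    Q = [[0.0] * n_a for _ in range(n_s)]
    for _ in range(iters):
        V = [max(row) for row in Q]
        Q = [[R[s][a] + gamma * sum(p * v for p, v in zip(P[s][a], V))
              for a in range(n_a)] for s in range(n_s)]
    return Q


def async_q_learning(P, R, gamma, steps, rng, h=50.0, t0=50.0):
    """Asynchronous Q-learning on one trajectory.

    Only the entry Q[s][a] of the visited pair changes at each step.
    h and t0 are illustrative constants; t0 >= h keeps alpha <= 1.
    """
    n_s, n_a = len(P), len(P[0])
    Q = [[0.0] * n_a for _ in range(n_s)]
    s = 0
    for t in range(steps):
        a = rng.randrange(n_a)  # uniform behaviour policy (assumption)
        s_next = rng.choices(range(n_s), weights=P[s][a])[0]
        alpha = h / (t + t0)
        target = R[s][a] + gamma * max(Q[s_next])
        Q[s][a] += alpha * (target - Q[s][a])
        s = s_next  # continue along the same trajectory
    return Q


def sup_error(Q, Q_ref):
    """Unweighted sup-norm distance between two Q tables."""
    return max(abs(q - r) for row, ref in zip(Q, Q_ref)
               for q, r in zip(row, ref))


if __name__ == "__main__":
    rng = random.Random(0)
    P, R = make_mdp(4, 2, rng)
    q_star = optimal_q(P, R, gamma=0.8)
    q_hat = async_q_learning(P, R, gamma=0.8, steps=50000, rng=rng)
    print("sup-norm error:", sup_error(q_hat, q_star))
```

The sketch only illustrates the update rule. It does not reproduce the paper's step-size conditions or its bound, and the error it prints depends on the arbitrary constants above.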

