LG SY Jun 27, 2021

Concentration of Contractive Stochastic Approximation and Reinforcement Learning

Siddharth Chandak, Vivek S. Borkar, Parth Dodhia

arXiv:2106.14308v4 17.531 citations

Originality: Incremental advance

AI Analysis

This work provides theoretical guarantees, in the form of concentration bounds, for reinforcement learning algorithms, addressing convergence issues that arise in practical applications.

The paper derives concentration bounds for stochastic approximation algorithms with contractive maps under both martingale difference and Markov noise, and applies them to reinforcement learning algorithms such as asynchronous Q-learning and TD(0).

Using a martingale concentration inequality, concentration bounds `from time $n_0$ on' are derived for stochastic approximation algorithms with contractive maps and both martingale difference and Markov noises. These are applied to reinforcement learning algorithms, in particular to asynchronous Q-learning and TD(0).
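
To make the setting concrete, here is a minimal sketch of a contractive stochastic approximation iteration of the form x_{n+1} = x_n + a(n) (F(x_n) - x_n + M_{n+1}). Everything specific in it is an assumption chosen for illustration, not taken from the paper: the map `example_map` (a max-norm contraction with modulus 0.5), the step size a(n) = 1/(n+1), and the i.i.d. Gaussian noise standing in for a martingale difference sequence. It only shows the iterates settling near the fixed point of F; it does not reproduce the paper's bounds.

```python
import random


def contractive_sa(F, x0, num_steps, noise_scale=1.0, seed=0):
    """Iterate x_{n+1} = x_n + a(n) * (F(x_n) - x_n + M_{n+1}).

    Assumptions (illustrative only): a(n) = 1 / (n + 1), and M_{n+1} is
    i.i.d. zero-mean Gaussian noise, a simple martingale difference sequence.
    """
    rng = random.Random(seed)
    x = list(x0)
    for n in range(num_steps):
        a_n = 1.0 / (n + 1)  # decreasing step size (assumed schedule)
        fx = F(x)
        # Noisy step toward F(x), applied coordinate by coordinate.
        x = [
            xi + a_n * (fi - xi + rng.gauss(0.0, noise_scale))
            for xi, fi in zip(x, fx)
        ]
    return x


def example_map(x):
    # Hypothetical contraction with modulus 0.5 in the max norm.
    # Its unique fixed point is (2, -2).
    return [0.5 * x[0] + 1.0, 0.5 * x[1] - 1.0]


x_star = [2.0, -2.0]
x_final = contractive_sa(example_map, x0=[10.0, 10.0], num_steps=20000)

# Max-norm distance of the last iterate from the fixed point.
error = max(abs(a - b) for a, b in zip(x_final, x_star))
print(error)
```

A concentration bound "from time $n_0$ on" concerns the probability that this kind of error stays small for all n >= n_0 simultaneously, rather than at a single time step. Rerunning the sketch with different `seed` values gives an informal feel for how the error varies across sample paths.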
