PR LG SY OC Jan 28, 2024

Rates of Convergence in the Central Limit Theorem for Markov Chains, with an Application to TD Learning

arXiv:2401.15719v310.819 citations h-index: 3 Math Oper Res

Originality: Synthesis-oriented

AI Analysis

This work provides foundational theoretical guarantees for TD learning algorithms in reinforcement learning. The contribution is incremental in that it extends existing central limit theorem results to a specific application.

The authors establish non-asymptotic convergence rates in the central limit theorem for Markov chains and apply them to Temporal Difference (TD) learning with averaging, yielding a theoretical framework for analyzing the convergence of TD learning.

We prove a non-asymptotic central limit theorem for vector-valued martingale differences using Stein's method, and use Poisson's equation to extend the result to functions of Markov chains. We then show that these results can be applied to establish a non-asymptotic central limit theorem for Temporal Difference (TD) learning with averaging.
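The abstract does not spell out the algorithm, so the following is only a minimal sketch of what "TD learning with averaging" can look like in the simplest tabular case: TD(0) on a small Markov reward process, with a running (Polyak-Ruppert style) average of the iterates. The function name `averaged_td0`, the step-size schedule, and the two-state example chain are illustrative assumptions, not taken from the paper, whose setting may be more general.

```python
import random


def averaged_td0(P, r, gamma, n_steps,
                 step_size=lambda k: (k + 1) ** -0.75, seed=0):
    """Tabular TD(0) on a Markov reward process, returning the iterate average.

    P is a row-stochastic transition matrix (list of lists), r the per-state
    reward, gamma the discount factor. The step-size schedule is an
    illustrative choice, not one prescribed by the source.
    """
    rng = random.Random(seed)
    n = len(r)
    v = [0.0] * n      # current TD iterate
    v_bar = [0.0] * n  # running average of the iterates
    s = 0
    for k in range(n_steps):
        # Sample the next state from row s of the transition matrix.
        s_next = rng.choices(range(n), weights=P[s])[0]
        # TD(0) update on the visited state only.
        td_error = r[s] + gamma * v[s_next] - v[s]
        v[s] += step_size(k) * td_error
        # Incremental mean of the first k + 1 iterates.
        for i in range(n):
            v_bar[i] += (v[i] - v_bar[i]) / (k + 1)
        s = s_next
    return v_bar


if __name__ == "__main__":
    # Hypothetical two-state chain, used only to exercise the sketch.
    P = [[0.9, 0.1], [0.2, 0.8]]
    r = [1.0, 0.0]
    print(averaged_td0(P, r, gamma=0.5, n_steps=200_000))
```

For this example chain the exact value function solves (I - gamma P) V = r, giving V = (0.6/0.325, 0.1/0.325), so the averaged estimate can be compared against it. A central limit theorem of the kind described above concerns the fluctuations of such an averaged estimate around V, rescaled by the square root of the number of steps.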
