Almost Sure Convergence Rates of Stochastic Approximation and Reinforcement Learning via a Poisson-Moreau Drift

arXiv:2605.0710453.1

Predicted impact top 46% in LG · last 90 daysOriginality Incremental advance

AI Analysis

This provides theoretical convergence guarantees for a broad class of RL algorithms (e.g., Q-learning) under realistic Markovian noise, addressing a fundamental challenge.

The paper establishes almost sure convergence rates for stochastic approximation and reinforcement learning under Markovian noise, achieving rates arbitrarily close to o(n^{1-2η}) for power-law learning rates and o(n^{-1}) for harmonic learning rates, which is near-optimal.

Establishing almost sure convergence rates for stochastic approximation and reinforcement learning under Markovian noise is a fundamental theoretical challenge. We make progress towards this challenge for a class of stochastic approximation algorithms whose expected updates are contractive, a setting that arises in many reinforcement learning algorithms such as $Q$-learning and linear temporal difference learning. Specifically, for a power-law learning rate $O(n^{-η})$ with $η\in (1/2, 1)$, we obtain an almost sure convergence rate arbitrarily close to $o(n^{1 - 2η})$. For a harmonic learning rate $O(n^{-1})$, we obtain an almost sure convergence rate arbitrarily close to $o(n^{-1})$, which we argue is a strong result because it is close to the optimal rate $O(n^{-1}\log\log n)$ given by the law of the iterated logarithm (for a special case of i.i.d. noise). Key to our analysis is a novel Lyapunov drift construction that applies a Poisson-equation based correction for Markovian noise to the well-established Moreau-envelope smoothing for the contractive mapping.

View on arXiv PDF

Similar