MLLGPRSTJan 30, 2021

On the Stability of Random Matrix Product with Markovian Noise: Application to Linear Stochastic Approximation and TD Learning

arXiv:2102.00185v124 citations
Originality Incremental advance
AI Analysis

This work addresses stability issues in stochastic algorithms for machine learning, such as reinforcement learning, but is incremental as it builds on prior results by relaxing conditions.

The paper tackles the problem of analyzing exponential stability in random matrix products driven by Markov chains, which is crucial for stochastic algorithms in machine learning, by providing a result that relaxes strong existing conditions and applies it to derive finite-time moment bounds for linear stochastic approximation and TD learning algorithms.

This paper studies the exponential stability of random matrix products driven by a general (possibly unbounded) state space Markov chain. It is a cornerstone in the analysis of stochastic algorithms in machine learning (e.g. for parameter tracking in online learning or reinforcement learning). The existing results impose strong conditions such as uniform boundedness of the matrix-valued functions and uniform ergodicity of the Markov chains. Our main contribution is an exponential stability result for the $p$-th moment of random matrix product, provided that (i) the underlying Markov chain satisfies a super-Lyapunov drift condition, (ii) the growth of the matrix-valued functions is controlled by an appropriately defined function (related to the drift condition). Using this result, we give finite-time $p$-th moment bounds for constant and decreasing stepsize linear stochastic approximation schemes with Markovian noise on general state space. We illustrate these findings for linear value-function estimation in reinforcement learning. We provide finite-time $p$-th moment bound for various members of temporal difference (TD) family of algorithms.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes