OC LG | Feb 28, 2023

Stochastic Gradient Descent under Markovian Sampling Schemes

arXiv:2302.14428v3 | 23.943 citations | h-index: 10

Originality: Incremental advance

AI Analysis

This work addresses optimization challenges in decentralized systems, reinforcement learning, and online identification, offering incremental improvements with milder assumptions.

The paper tackles the problem of stochastic gradient descent under Markovian sampling schemes, establishing a theoretical lower bound dependent on the hitting time of the Markov chain and introducing MC-SAG, a variance-reduced method that achieves communication efficiency in token algorithms.

We study a variation of vanilla stochastic gradient descent in which the optimizer only has access to a Markovian sampling scheme. These schemes encompass applications that range from decentralized optimization with a random walker (token algorithms) to RL and online system identification problems. We focus on obtaining rates of convergence under the least restrictive assumptions possible on the underlying Markov chain and on the functions optimized. We first establish a theoretical lower bound for methods that sample stochastic gradients along the path of a Markov chain, which reveals a dependence on the hitting time of the underlying Markov chain. We then study Markov chain SGD (MC-SGD) under much milder regularity assumptions than prior works (e.g., no bounded gradients or domain, and infinite state spaces). We finally introduce MC-SAG, an alternative to MC-SGD with variance reduction that depends only on the hitting time of the Markov chain, therefore obtaining a communication-efficient token algorithm.
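To make the sampling scheme concrete, here is a minimal sketch of SGD driven by a Markov chain, followed by a SAG-style variance-reduced variant. It is an illustration under assumptions that are not taken from the paper: the chain is a lazy random walk on a ring of nodes, each node v holds the toy objective f_v(x) = 0.5 * (x - targets[v])**2, and the names `ring_walk`, `mc_sgd`, `mc_sag`, and `targets` are hypothetical. The second function only mirrors the general SAG idea of a gradient table and should not be read as the paper's exact MC-SAG update.

```python
import random


def ring_walk(n, start, steps, rng):
    """Yield the states of a lazy random walk on a ring of n nodes.

    Assumption for illustration: the walker stays or moves to a
    neighbour with equal probability, so the stationary law is uniform.
    """
    v = start
    for _ in range(steps):
        yield v
        v = (v + rng.choice((-1, 0, 1))) % n


def mc_sgd(targets, steps, seed=0):
    """SGD where the sampled index follows the Markov chain.

    Uses the step size 1 / (t + 1), a choice made for this strongly
    convex toy problem only.
    """
    rng = random.Random(seed)
    x = 0.0
    for t, v in enumerate(ring_walk(len(targets), 0, steps, rng)):
        grad = x - targets[v]  # gradient of f_v at x
        x -= grad / (t + 1)
    return x


def mc_sag(targets, steps, step, seed=0):
    """SAG-style variant: keep the last gradient seen at each node.

    The walker refreshes only the table entry of the node it visits
    and steps along the running average of the whole table.
    """
    rng = random.Random(seed)
    n = len(targets)
    x = 0.0
    table = [0.0] * n  # last gradient recorded at each node
    avg = 0.0          # running mean of the table entries
    for v in ring_walk(n, 0, steps, rng):
        grad = x - targets[v]
        avg += (grad - table[v]) / n
        table[v] = grad
        x -= step * avg
    return x


if __name__ == "__main__":
    targets = [1.0, 2.0, 3.0, 4.0, 5.0]  # minimizer of the average is 3.0
    print("MC-SGD:", mc_sgd(targets, steps=20000))
    print("MC-SAG:", mc_sag(targets, steps=5000, step=0.05))
```

In this toy setting the table entry of a node is refreshed only when the walker returns to it, so the staleness of the stored gradients is governed by how long the chain takes to reach each node, which gives some intuition for why hitting times appear in the analysis.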
