Stability and Generalization for Decentralized Markov SGD
The paper extends stability-based generalization theory to decentralized and minimax settings with Markovian data, addressing a gap for practitioners who run decentralized learning on dependent data.
This work provides non-asymptotic generalization bounds for decentralized SGD and SGDA under Markov chain sampling, capturing the joint effects of network topology, mixing properties, and primal-dual dynamics.
Stochastic gradient methods are central to large-scale learning, yet their generalization theory typically relies on independent sampling assumptions. In many practical applications, data are generated by Markov chains and learning is performed in a decentralized manner, which introduces significant analytical challenges. In this work, we investigate the stability and generalization of decentralized stochastic gradient descent (SGD) and stochastic gradient descent ascent (SGDA) under Markov chain sampling. Leveraging a stability-based framework, we characterize how Markovian dependence and decentralized communication jointly influence generalization behavior. Our analysis captures the effects of network topology, Markov chain mixing properties, and primal-dual dynamics. We establish non-asymptotic generalization bounds for both algorithms, extending existing results on Markov stochastic gradient methods to decentralized and minimax settings.
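To make the setting concrete, the following is a minimal sketch of decentralized SGD with Markov chain sampling. It is not the paper's algorithm: the abstract gives no update rule, so the sketch assumes the common gossip-then-step form, in which each node averages its neighbors' iterates through a doubly stochastic mixing matrix and then takes a local gradient step. The ring topology, the lazy random walk over sample indices, the scalar least-squares loss, and all function names are hypothetical choices made for illustration.

```python
import random


def ring_mixing_matrix(m):
    """Doubly stochastic mixing matrix for a ring of m >= 3 nodes (assumed topology)."""
    W = [[0.0] * m for _ in range(m)]
    for i in range(m):
        # Each node weights itself and its two ring neighbors equally.
        for j in (i - 1, i, i + 1):
            W[i][j % m] = 1.0 / 3.0
    return W


def markov_step(k, n, rng):
    """Lazy random walk on a cycle of n sample indices: move left, stay, or move right."""
    return (k + rng.choice((-1, 0, 1))) % n


def decentralized_markov_sgd(data, W, steps, eta, seed=0):
    """Run the sketch on a scalar least-squares loss 0.5 * (w * a - b) ** 2.

    data[i] is the local list of (a, b) samples held by node i.
    Returns the list of final local iterates, one per node.
    """
    rng = random.Random(seed)
    m = len(data)
    w = [0.0] * m
    # Each node runs its own Markov chain over its local sample indices.
    idx = [rng.randrange(len(d)) for d in data]
    for _ in range(steps):
        # Local stochastic gradients at the current iterates and chain states.
        grads = []
        for i in range(m):
            a, b = data[i][idx[i]]
            grads.append((w[i] * a - b) * a)
        # Gossip averaging with neighbors, followed by the local gradient step.
        w = [
            sum(W[i][j] * w[j] for j in range(m)) - eta * grads[i]
            for i in range(m)
        ]
        # Advance every chain by one transition; consecutive samples are dependent.
        idx = [markov_step(idx[i], len(data[i]), rng) for i in range(m)]
    return w


if __name__ == "__main__":
    # Four nodes, each holding noiseless samples of the relation b = 2 * a.
    local_data = [[(a, 2.0 * a) for a in (0.5, 1.0, 1.5)] for _ in range(4)]
    print(decentralized_markov_sgd(local_data, ring_mixing_matrix(4), steps=2000, eta=0.1))
```

In this sketch the network topology enters only through the mixing matrix `W`, and the data dependence enters only through `markov_step`. These are two of the quantities whose joint effect on generalization the abstract says the bounds capture.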