LG GT | Jan 26, 2023

On the Convergence of No-Regret Learning Dynamics in Time-Varying Games

Ioannis Anagnostides, Ioannis Panageas, Gabriele Farina, Tuomas Sandholm

arXiv:2301.11241v315.526 citations | h-index: 81

Originality: Incremental advance

AI Analysis

This work addresses a gap in multiagent learning for dynamic settings, offering incremental theoretical advances with implications for meta-learning and regret bounds.

The paper tackles the convergence of no-regret learning algorithms like optimistic gradient descent in time-varying games, providing sharp bounds for equilibrium gaps and improved second-order variation bounds under strong convexity-concavity, with applications to general-sum games and dynamic regret.

Most of the literature on learning in games has focused on the restrictive setting where the underlying repeated game does not change over time. Much less is known about the convergence of no-regret learning algorithms in dynamic multiagent settings. In this paper, we characterize the convergence of optimistic gradient descent (OGD) in time-varying games. Our framework yields sharp convergence bounds for the equilibrium gap of OGD in zero-sum games parameterized on natural variation measures of the sequence of games, subsuming known results for static games. Furthermore, we establish improved second-order variation bounds under strong convexity-concavity, as long as each game is repeated multiple times. Our results also apply to time-varying general-sum multi-player games via a bilinear formulation of correlated equilibria, which has novel implications for meta-learning and for obtaining refined variation-dependent regret bounds, addressing questions left open in prior papers. Finally, we leverage our framework to also provide new insights on dynamic regret guarantees in static games.
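The setting in the abstract can be illustrated with a minimal sketch of projected optimistic gradient descent on a sequence of zero-sum matrix games, tracking the equilibrium gap at each round. This is not the authors' code: the function names (`project_simplex`, `equilibrium_gap`, `ogd_time_varying`), the step size, the starting points, and the matching-pennies payoff matrix are all assumptions chosen for illustration, and the standard simplex-constrained form of OGD is assumed rather than taken from the paper.

```python
def project_simplex(v):
    """Euclidean projection of v onto the probability simplex (sort-based)."""
    u = sorted(v, reverse=True)
    css, theta = 0.0, 0.0
    for k, u_k in enumerate(u, start=1):
        css += u_k
        t = (css - 1.0) / k
        if u_k - t > 0:
            theta = t  # keep the threshold of the largest valid k
    return [max(v_i - theta, 0.0) for v_i in v]


def mat_vec(A, y):
    """Compute A y."""
    return [sum(a * b for a, b in zip(row, y)) for row in A]


def mat_t_vec(A, x):
    """Compute A^T x."""
    return [sum(A[i][j] * x[i] for i in range(len(A))) for j in range(len(A[0]))]


def equilibrium_gap(A, x, y):
    """Duality gap of (x, y) in the game min_x max_y x^T A y."""
    return max(mat_t_vec(A, x)) - min(mat_vec(A, y))


def ogd_time_varying(games, x0, y0, eta):
    """Run projected OGD against a sequence of payoff matrices.

    x minimizes and y maximizes x^T A_t y. Each player predicts the next
    gradient with the most recent one (zero before the first round).
    """
    x_hat, y_hat = list(x0), list(y0)
    gx_prev, gy_prev = [0.0] * len(x0), [0.0] * len(y0)
    x, y, gaps = list(x0), list(y0), []
    for A in games:
        # Optimistic step: move from the secondary iterate using the prediction.
        x = project_simplex([xh - eta * g for xh, g in zip(x_hat, gx_prev)])
        y = project_simplex([yh + eta * g for yh, g in zip(y_hat, gy_prev)])
        gaps.append(equilibrium_gap(A, x, y))
        # Observe the gradients of the current game at the played strategies.
        gx, gy = mat_vec(A, y), mat_t_vec(A, x)
        # Update the secondary iterates with the observed gradients.
        x_hat = project_simplex([xh - eta * g for xh, g in zip(x_hat, gx)])
        y_hat = project_simplex([yh + eta * g for yh, g in zip(y_hat, gy)])
        gx_prev, gy_prev = gx, gy
    return x, y, gaps


# Static special case: the same (illustrative) matching-pennies game each round.
A = [[1.0, -1.0], [-1.0, 1.0]]
x, y, gaps = ogd_time_varying([A] * 2000, [0.6, 0.4], [0.3, 0.7], eta=0.1)
```

Passing a list of different matrices in place of `[A] * 2000` gives the time-varying case; the gap sequence then depends on how much consecutive games differ, which is the kind of variation measure the abstract refers to.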
