GT LG MLJun 18, 2022

Mutation-Driven Follow the Regularized Leader for Last-Iterate Convergence in Zero-Sum Games

Kenshi Abe, Mitsuki Sakamoto, Atsushi Iwasaki

arXiv:2206.09254v18.623 citationsh-index: 19Has Code

Originality Incremental advance

AI Analysis

This addresses the issue of last-iterate convergence in zero-sum games for researchers and practitioners in game theory and optimization, though it is incremental as it builds on existing FTRL variants.

The paper tackles the problem of limit cycling behavior in Follow the Regularized Leader (FTRL) dynamics for two-player zero-sum games by proposing mutant FTRL (M-FTRL), which introduces mutation for action probability perturbation, resulting in strong convergence guarantees to approximate Nash equilibria and faster convergence rates than FTRL and optimistic FTRL under full-information feedback, with clear convergence under bandit feedback.

In this study, we consider a variant of the Follow the Regularized Leader (FTRL) dynamics in two-player zero-sum games. FTRL is guaranteed to converge to a Nash equilibrium when time-averaging the strategies, while a lot of variants suffer from the issue of limit cycling behavior, i.e., lack the last-iterate convergence guarantee. To this end, we propose mutant FTRL (M-FTRL), an algorithm that introduces mutation for the perturbation of action probabilities. We then investigate the continuous-time dynamics of M-FTRL and provide the strong convergence guarantees toward stationary points that approximate Nash equilibria under full-information feedback. Furthermore, our simulation demonstrates that M-FTRL can enjoy faster convergence rates than FTRL and optimistic FTRL under full-information feedback and surprisingly exhibits clear convergence under bandit feedback.

View on arXiv PDF Code

Similar