From Poincaré Recurrence to Convergence in Imperfect Information Games: Finding Equilibrium via Regularization
This work addresses equilibrium computation for AI agents in complex games like poker, though it appears incremental as it builds on existing regularization techniques.
The paper tackles the problem of finding Nash equilibria in sequential imperfect information games by generalizing Poincaré recurrence results and using reward regularization, leading to algorithms that converge exactly to equilibrium and achieve state-of-the-art performance in zero-sum two-player settings.
In this paper we investigate the Follow the Regularized Leader dynamics in sequential imperfect information games (IIGs). We generalize existing Poincaré recurrence results from normal-form games to zero-sum two-player IIGs and other sequential game settings. We then investigate how adapting the reward of the game (by adding a regularization term) can give strong convergence guarantees in monotone games. We continue by showing how this reward adaptation technique can be leveraged to build algorithms that converge exactly to the Nash equilibrium. Finally, we show how these insights can be directly used to build state-of-the-art model-free algorithms for zero-sum two-player IIGs.
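The contrast between recurrence and convergence can be illustrated on a toy normal-form game. The sketch below is not the paper's algorithm. It assumes matching pennies as the game, an entropy regularizer (so the Follow the Regularized Leader policy is a softmax of accumulated payoffs), a simple Euler discretization with step size `lr`, and one plausible form of the reward adaptation: a penalty `eta * log(pi / mu)` against a uniform reference policy `mu`. All names (`run_ftrl`, `eta`, `lr`, `kl_from_uniform`) are illustrative.

```python
import math

# Matching pennies: payoff to player 1; player 2 receives the negative.
PAYOFF = [[1.0, -1.0], [-1.0, 1.0]]


def softmax(y):
    m = max(y)
    e = [math.exp(v - m) for v in y]
    s = sum(e)
    return [v / s for v in e]


def kl_from_uniform(p):
    # KL(uniform || p): zero exactly when p is the uniform (equilibrium) policy.
    return sum(0.5 * math.log(0.5 / v) for v in p)


def run_ftrl(eta, steps=20000, lr=0.01):
    """Discretized FTRL with an entropy regularizer.

    eta scales the (assumed) reward penalty; eta = 0 gives the unregularized dynamics.
    Returns the summed KL divergence from the equilibrium to the final policies.
    """
    y1, y2 = [0.5, 0.0], [0.0, 0.0]  # accumulated payoffs; player 1 starts off-equilibrium
    for _ in range(steps):
        p1, p2 = softmax(y1), softmax(y2)
        # Expected payoff of each action against the opponent's current policy.
        q1 = [sum(PAYOFF[a][b] * p2[b] for b in range(2)) for a in range(2)]
        q2 = [-sum(PAYOFF[a][b] * p1[a] for a in range(2)) for b in range(2)]
        for a in range(2):
            # Adapted reward: subtract eta * log(pi(a) / mu(a)) with uniform mu(a) = 0.5.
            y1[a] += lr * (q1[a] - eta * math.log(p1[a] / 0.5))
            y2[a] += lr * (q2[a] - eta * math.log(p2[a] / 0.5))
    return kl_from_uniform(softmax(y1)) + kl_from_uniform(softmax(y2))


if __name__ == "__main__":
    print("unregularized distance:", run_ftrl(eta=0.0))
    print("regularized distance:  ", run_ftrl(eta=0.2))
```

With `eta = 0` the policies keep orbiting the equilibrium and the distance does not shrink, whereas with `eta > 0` the added term damps the rotation and the policies settle at the fixed point of the regularized game. In this symmetric example that fixed point happens to coincide with the Nash equilibrium; in general it need not, which is why a further step is needed to recover exact convergence.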