GT MLDec 3, 2020

On the Impossibility of Convergence of Mixed Strategies with No Regret Learning

Vidya Muthukumar, Soham Phade, Anant Sahai

arXiv:2012.02125v37.35 citations

Originality Highly original

AI Analysis

This work identifies a fundamental limitation for game theory researchers and practitioners using no-regret learning, showing that even optimal algorithms fail to converge to Nash equilibria in certain settings.

This paper investigates the convergence of mixed strategies derived from optimal no-regret learning in 2x2 competitive games. It demonstrates that for mean-based and monotonic no-regret algorithms, the limiting mixed strategies cannot converge almost surely to any Nash equilibrium.

We study the limiting behavior of the mixed strategies that result from optimal no-regret learning strategies in a repeated game setting where the stage game is any 2 by 2 competitive game. We consider optimal no-regret algorithms that are mean-based and monotonic in their argument. We show that for any such algorithm, the limiting mixed strategies of the players cannot converge almost surely to any Nash equilibrium. This negative result is also shown to hold under a broad relaxation of these assumptions, including popular variants of Online-Mirror-Descent with optimism and/or adaptive step-sizes. Finally, we conjecture that the monotonicity assumption can be removed, and provide partial evidence for this conjecture. Our results identify the inherent stochasticity in players' realizations as a critical factor underlying this divergence in outcomes between using the opponent's mixtures and realizations to make updates.

View on arXiv PDF

Similar