Evolutionary Game Theory Squared: Evolving Agents in Endogenously Evolving Zero-Sum Games
This work provides a foundational theoretical framework for understanding coevolving competitive systems and is relevant to researchers in evolutionary game theory and online learning.
This paper introduces a framework in which both the agents and the zero-sum games they play coevolve strategically over time, moving beyond the traditional fixed-game paradigm. Despite the chaotic coevolution, the system exhibits conservation laws, Poincaré recurrence, and convergence of time-average agent behavior and utility to the Nash equilibrium values of the time-average game. The authors also provide a polynomial-time algorithm to predict this time-average behavior.
The predominant paradigm in evolutionary game theory and, more generally, online learning in games is based on a clear distinction between a population of dynamic agents and the fixed, static game in which they interact. In this paper, we move away from this artificial divide between dynamic agents and static games to introduce and analyze a large class of competitive settings where both the agents and the games they play evolve strategically over time. We focus on arguably the most archetypal game-theoretic setting -- zero-sum games (as well as network generalizations) -- and the most studied evolutionary learning dynamic -- replicator, the continuous-time analogue of multiplicative weights. Populations of agents compete against each other in a zero-sum competition that itself evolves adversarially in response to the current population mixture. Remarkably, despite the chaotic coevolution of agents and games, we prove that the system exhibits a number of regularities. First, the system has conservation laws of an information-theoretic flavor that couple the behavior of all agents and games. Second, the system is Poincaré recurrent, with effectively all possible initializations of agents and games lying on recurrent orbits that come arbitrarily close to their initial conditions infinitely often. Third, the time-average agent behavior and utility converge to the Nash equilibrium values of the time-average game. Finally, we provide a polynomial-time algorithm that predicts this time-average behavior for any such coevolving network game.
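For intuition, the classical fixed-game baseline that the abstract generalizes can be simulated in a few lines. The sketch below is an illustration only, not the paper's coevolving model: it assumes a static Matching Pennies game, and the helper names (`replicator_field`, `rk4_step`, `invariant`, `simulate`), the initial condition, and the step size are all hypothetical choices. It integrates replicator dynamics with a fourth-order Runge-Kutta scheme and tracks two quantities: the sum of KL divergences from the uniform Nash equilibrium to the current strategies (an information-theoretic invariant in this static setting) and the time-average strategies.

```python
import math


def replicator_field(p, q):
    """Replicator vector field for Matching Pennies (a static zero-sum game).

    p is the probability that player 1 plays Heads, q the same for player 2.
    Player 1's payoff matrix is A = [[1, -1], [-1, 1]]; player 2 receives -A.
    """
    dp = 2.0 * p * (1.0 - p) * (2.0 * q - 1.0)
    dq = -2.0 * q * (1.0 - q) * (2.0 * p - 1.0)
    return dp, dq


def rk4_step(p, q, dt):
    """Advance (p, q) by one classical fourth-order Runge-Kutta step."""
    k1p, k1q = replicator_field(p, q)
    k2p, k2q = replicator_field(p + 0.5 * dt * k1p, q + 0.5 * dt * k1q)
    k3p, k3q = replicator_field(p + 0.5 * dt * k2p, q + 0.5 * dt * k2q)
    k4p, k4q = replicator_field(p + dt * k3p, q + dt * k3q)
    p_next = p + dt * (k1p + 2.0 * k2p + 2.0 * k3p + k4p) / 6.0
    q_next = q + dt * (k1q + 2.0 * k2q + 2.0 * k3q + k4q) / 6.0
    return p_next, q_next


def invariant(p, q):
    """Sum of KL divergences from the uniform Nash equilibrium to (p, q)."""
    kl_p = -0.5 * math.log(4.0 * p * (1.0 - p))
    kl_q = -0.5 * math.log(4.0 * q * (1.0 - q))
    return kl_p + kl_q


def simulate(p0=0.9, q0=0.3, dt=0.01, steps=20000):
    """Return time-average strategies and the invariant at start and end."""
    p, q = p0, q0
    sum_p = 0.0
    sum_q = 0.0
    inv_start = invariant(p, q)
    for _ in range(steps):
        sum_p += p
        sum_q += q
        p, q = rk4_step(p, q, dt)
    return sum_p / steps, sum_q / steps, inv_start, invariant(p, q)


if __name__ == "__main__":
    p_avg, q_avg, inv_start, inv_end = simulate()
    print("time-average strategies:", p_avg, q_avg)
    print("invariant at start and end:", inv_start, inv_end)
```

In this static example the orbit cycles around the equilibrium rather than converging to it, while the time averages approach the Nash strategies (1/2, 1/2). The abstract claims analogous regularities when the game itself evolves, a case this sketch does not model.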