The Evolutionary Dynamics of Independent Learning Agents in Population Games
This work addresses a foundational open problem in multi-agent systems and gives researchers a novel analytical tool for population games, though the contribution is incremental in that it extends prior analyses of 2-player games.
The paper studies the evolutionary dynamics of multi-agent reinforcement learning by extending the analysis from 2-player games to population games. The result is a unified framework that characterises the population dynamics via a single partial differential equation, validated experimentally across a variety of learning methods and population games.
Understanding the evolutionary dynamics of reinforcement learning in multi-agent settings has long remained an open problem. While previous works primarily focus on 2-player games, we consider population games, which model the strategic interactions of a large population of small, anonymous agents. This paper presents a formal relation between stochastic processes and the dynamics of independent learning agents who reason based on reward signals. Using a master equation approach, we provide a novel unified framework for characterising population dynamics via a single partial differential equation (Theorem 1). Through a case study involving Cross learning agents, we illustrate that Theorem 1 allows us to identify qualitatively different evolutionary dynamics, to analyse steady states, and to gain insights into the expected behaviour of a population. In addition, we present extensive experimental results validating that Theorem 1 holds for a variety of learning methods and population games.
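To make the setting concrete, the following is a minimal simulation sketch of a population of independent Cross learning agents. It is not the experimental setup of the paper. It assumes the standard Cross learning rule with rewards in [0, 1], uniformly random pairwise matching, and a symmetric two-action game. The names `cross_update`, `simulate`, and `mean_policy`, as well as the payoff values, are hypothetical and chosen only for illustration.

```python
import random


def cross_update(policy, action, reward):
    """Standard Cross learning step; assumes reward lies in [0, 1].

    The chosen action gains probability reward * (1 - p); every other
    action loses reward * p, so the policy still sums to 1.
    """
    return [
        p + reward * ((1.0 if i == action else 0.0) - p)
        for i, p in enumerate(policy)
    ]


def simulate(payoff, n_agents=200, steps=2000, seed=0):
    """Simulate independent Cross learners under random pairwise matching.

    payoff[a][b] is the reward (assumed in [0, 1]) for playing action a
    against an opponent who plays action b. Random pairwise matching is
    an assumption of this sketch.
    """
    rng = random.Random(seed)
    n_actions = len(payoff)

    # Each agent starts from a random mixed strategy.
    policies = []
    for _ in range(n_agents):
        weights = [rng.random() for _ in range(n_actions)]
        total = sum(weights)
        policies.append([w / total for w in weights])

    for _ in range(steps):
        # Draw two distinct agents and let each sample an action.
        i, j = rng.sample(range(n_agents), 2)
        a_i = rng.choices(range(n_actions), weights=policies[i])[0]
        a_j = rng.choices(range(n_actions), weights=policies[j])[0]
        # Each agent updates using only its own reward signal.
        policies[i] = cross_update(policies[i], a_i, payoff[a_i][a_j])
        policies[j] = cross_update(policies[j], a_j, payoff[a_j][a_i])

    return policies


def mean_policy(policies):
    """Average policy across agents: a coarse summary of the population."""
    n_agents = len(policies)
    n_actions = len(policies[0])
    return [sum(p[a] for p in policies) / n_agents for a in range(n_actions)]


if __name__ == "__main__":
    # Hypothetical payoff matrix with rewards in [0, 1].
    example_payoff = [[0.2, 0.8], [0.6, 0.4]]
    final_policies = simulate(example_payoff)
    print(mean_policy(final_policies))
```

Recording the full list of agent policies at several time steps, rather than only the mean, gives an empirical distribution over policies whose evolution could be compared with the density described by a partial differential equation such as the one in Theorem 1.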