GT LG MA SY

Mar 6, 2024

Population-aware Online Mirror Descent for Mean-Field Games by Deep Reinforcement Learning

Zida Wu, Mathieu Lauriere, Samuel Jia Cong Chua, Matthieu Geist, Olivier Pietquin, Ankur Mehta

arXiv:2403.03552v17.311 citations

h-index: 50

AAMAS

Originality: Incremental advance

AI Analysis

This paper addresses the problem of learning equilibrium policies for large-scale multi-agent systems modeled as Mean Field Games, offering an incremental improvement over existing methods.

The paper tackles the challenge of learning Nash equilibria in Mean Field Games by proposing a deep reinforcement learning algorithm that achieves population-dependent Nash equilibrium without averaging or sampling from history, demonstrating better convergence properties than state-of-the-art algorithms in numerical experiments on four canonical examples.

Mean Field Games (MFGs) can handle large-scale multi-agent systems, but learning Nash equilibria in MFGs remains challenging. In this paper, we propose a deep reinforcement learning (DRL) algorithm, inspired by Munchausen RL and Online Mirror Descent, that achieves a population-dependent Nash equilibrium without averaging over or sampling from history. Through the design of an additional inner-loop replay buffer, the agents can effectively learn to achieve a Nash equilibrium from any distribution, which mitigates catastrophic forgetting. The resulting policy can be applied to various initial distributions. Numerical experiments on four canonical examples demonstrate that our algorithm has better convergence properties than state-of-the-art algorithms, in particular a DRL version of Fictitious Play for population-dependent policies.
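To make the Online Mirror Descent idea in the abstract concrete, here is a minimal tabular sketch. It is not the paper's deep RL algorithm: there is no neural network, no Munchausen trick, no replay buffer, and the policy is not population-dependent. The toy game (five states on a ring, a crowd-aversion reward `-mu[s]`), the step size, and all function names are assumptions made for illustration only. Each iteration computes the mean-field flow induced by the current policy, evaluates that policy's Q-function against the flow, adds the Q-function to a running sum, and takes a softmax of the sum as the next policy.

```python
import numpy as np

def forward_distribution(pi, mu0, next_state):
    """Mean-field flow mu[t] induced by policy pi starting from mu0."""
    T, S, A = pi.shape
    mu = np.zeros((T + 1, S))
    mu[0] = mu0
    for t in range(T):
        for s in range(S):
            for a in range(A):
                mu[t + 1, next_state[s, a]] += mu[t, s] * pi[t, s, a]
    return mu

def evaluate_q(pi, mu, next_state, reward):
    """Backward induction for the Q-function of pi against a fixed flow mu."""
    T, S, A = pi.shape
    q = np.zeros((T, S, A))
    v_next = np.zeros(S)  # terminal value is zero
    for t in reversed(range(T)):
        r = reward(mu[t])                      # shape (S,)
        q[t] = r[:, None] + v_next[next_state]  # deterministic transitions
        v_next = (pi[t] * q[t]).sum(axis=1)
    return q

def softmax(y):
    """Numerically stable softmax over the last (action) axis."""
    e = np.exp(y - y.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def online_mirror_descent(mu0, n_iters=50, horizon=10, step=0.1):
    """Tabular OMD on a hypothetical crowd-aversion game on a ring."""
    S = len(mu0)
    moves = np.array([-1, 0, 1])  # left, stay, right
    next_state = (np.arange(S)[:, None] + moves[None, :]) % S
    reward = lambda m: -m         # assumed reward: avoid crowded states
    y = np.zeros((horizon, S, len(moves)))  # running sum of Q-functions
    pi = softmax(y)               # start from the uniform policy
    for _ in range(n_iters):
        mu = forward_distribution(pi, mu0, next_state)
        q = evaluate_q(pi, mu, next_state, reward)
        y += step * q             # mirror-descent accumulation
        pi = softmax(y)           # new policy from the accumulated sum
    return pi, forward_distribution(pi, mu0, next_state)

mu0 = np.array([1.0, 0.0, 0.0, 0.0, 0.0])  # everyone starts in state 0
pi, mu = online_mirror_descent(mu0)
```

In this sketch the policy is tied to the single initial distribution `mu0` it was trained on. The population-dependent policy described in the abstract would instead take the current distribution as an additional input, so that one trained policy can be reused across initial distributions.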

