A General Framework for Learning Mean-Field Games
This work addresses scalable and stable learning in large multi-agent systems, with equilibrium product pricing as the motivating application. It offers a new method for a known bottleneck rather than an incremental improvement.
The paper tackles learning and decision-making in stochastic games with large populations. It proposes a general mean-field game (GMFG) framework, establishes the existence of a unique Nash equilibrium, and develops stable reinforcement learning algorithms (GMF-V and GMF-P) that outperform existing multi-agent methods in convergence speed, accuracy, and stability.
This paper presents a general mean-field game (GMFG) framework for simultaneous learning and decision-making in stochastic games with a large population. It first establishes the existence of a unique Nash equilibrium for this GMFG, and demonstrates that naively combining reinforcement learning with the fixed-point approach in classical MFGs yields unstable algorithms. It then proposes value-based and policy-based reinforcement learning algorithms (GMF-V and GMF-P, respectively) with smoothed policies, and analyzes their convergence properties and computational complexities. Experiments on an equilibrium product pricing problem demonstrate that GMF-V-Q and GMF-P-TRPO, which instantiate GMF-V with Q-learning and GMF-P with TRPO, respectively, are both efficient and robust in the GMFG setting. Moreover, they outperform existing algorithms for multi-agent reinforcement learning in the $N$-player setting in convergence speed, accuracy, and stability.
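One plausible intuition for why smoothed policies help can be sketched in a few lines. The sketch below assumes a softmax (Boltzmann) smoothing with a temperature parameter; the text above does not say which smoothing operator the algorithms use, and the function names and toy Q-values are hypothetical. It compares how a greedy policy and a softmax policy react to a tiny perturbation of the Q-values, such as the estimation noise one would expect inside a fixed-point loop.

```python
import math

def softmax_policy(q_values, temperature=1.0):
    """Action probabilities proportional to exp(q / temperature).

    Assumption: softmax is used here only as one example of a smoothed policy.
    """
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    m = max(q_values)  # subtract the max for numerical stability
    weights = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(weights)
    return [w / total for w in weights]

def greedy_policy(q_values):
    """Put all probability on the first action with the highest Q-value."""
    best = max(range(len(q_values)), key=lambda a: q_values[a])
    return [1.0 if a == best else 0.0 for a in range(len(q_values))]

def max_gap(p, r):
    """Largest per-action difference between two policies."""
    return max(abs(x - y) for x, y in zip(p, r))

# Two Q-estimates that differ only by a small perturbation (toy values).
q_a = [1.000, 1.001]
q_b = [1.001, 1.000]

greedy_gap = max_gap(greedy_policy(q_a), greedy_policy(q_b))
smooth_gap = max_gap(softmax_policy(q_a), softmax_policy(q_b))
print(f"greedy policy change:  {greedy_gap:.4f}")
print(f"softmax policy change: {smooth_gap:.4f}")
```

In this toy case the greedy policy flips entirely between the two actions, while the softmax policy barely moves. A lower temperature makes the softmax policy closer to greedy, so the temperature trades off smoothness against how closely the policy tracks the best action.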