AI | Feb 23

Recurrent Structural Policy Gradient for Partially Observable Mean Field Games

Clarisse Wibault, Johannes Forkel, Sebastian Towers, Tiphaine Wibault, Juan Duque, George Whittle, Andreas Schaab, Yucheng Yang, Chiyuan Wang, Michael Osborne, Benjamin Moll, Jakob Foerster

arXiv:2602.20141v12.41 citations | h-index: 9 | Has Code

Originality: Highly original

AI Analysis

This work addresses algorithmic limitations in modeling large population interactions with partial observability, offering a novel method for researchers in game theory and economics.

The authors extend Hybrid Structural Methods to Partially Observable Mean Field Games by proposing Recurrent Structural Policy Gradient (RSPG). They report state-of-the-art performance, an order-of-magnitude faster convergence, and the first solution of a macroeconomics MFG with heterogeneous agents, common noise, and history-aware policies.

Mean Field Games (MFGs) provide a principled framework for modeling interactions in large population models: at scale, population dynamics become deterministic, with uncertainty entering only through aggregate shocks, or common noise. However, algorithmic progress has been limited since model-free methods are too high variance and exact methods scale poorly. Recent Hybrid Structural Methods (HSMs) use Monte Carlo rollouts for the common noise in combination with exact estimation of the expected return, conditioned on those samples. However, HSMs have not been scaled to Partially Observable settings. We propose Recurrent Structural Policy Gradient (RSPG), the first history-aware HSM for settings involving public information. We also introduce MFAX, our JAX-based framework for MFGs. By leveraging known transition dynamics, RSPG achieves state-of-the-art performance as well as an order-of-magnitude faster convergence and solves, for the first time, a macroeconomics MFG with heterogeneous agents, common noise and history-aware policies. MFAX is publicly available at: https://github.com/CWibault/mfax.
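The hybrid idea described in the abstract (sample the common noise, then compute the expected return exactly given that sample) can be illustrated with a minimal sketch. This is not RSPG and does not use the MFAX API: the three-state model, the transition table `P`, the `reward` function, and all function names below are hypothetical, and the policy is fixed rather than learned or history-aware. It only shows how Monte Carlo sampling over shocks combines with exact propagation of the population distribution.

```python
import random

N_STATES, N_ACTIONS, HORIZON = 3, 2, 5

# Hypothetical toy model: P[z][a][s] is the next-state distribution
# for an agent in state s taking action a under common shock z.
P = {
    0: [
        [[0.8, 0.2, 0.0], [0.1, 0.8, 0.1], [0.0, 0.2, 0.8]],  # action 0
        [[0.2, 0.8, 0.0], [0.0, 0.2, 0.8], [0.8, 0.0, 0.2]],  # action 1
    ],
    1: [
        [[0.6, 0.4, 0.0], [0.2, 0.6, 0.2], [0.0, 0.4, 0.6]],  # action 0
        [[0.4, 0.6, 0.0], [0.0, 0.4, 0.6], [0.6, 0.0, 0.4]],  # action 1
    ],
}


def reward(s, a, mu, z):
    """Assumed reward: congestion penalty scaled by the shock, plus an action cost."""
    return -(1.0 + z) * mu[s] - 0.1 * a


def step_mean_field(mu, policy, z):
    """Exact, deterministic update of the population distribution for a given shock z."""
    nxt = [0.0] * N_STATES
    for s in range(N_STATES):
        for a in range(N_ACTIONS):
            weight = mu[s] * policy[s][a]
            for s2 in range(N_STATES):
                nxt[s2] += weight * P[z][a][s][s2]
    return nxt


def exact_return(mu0, policy, shocks):
    """Expected return conditioned on one shock path, computed without sampling agents."""
    mu, total = list(mu0), 0.0
    for z in shocks:
        total += sum(
            mu[s] * policy[s][a] * reward(s, a, mu, z)
            for s in range(N_STATES)
            for a in range(N_ACTIONS)
        )
        mu = step_mean_field(mu, policy, z)
    return total, mu


def hybrid_estimate(mu0, policy, n_rollouts, seed=0):
    """Monte Carlo average over shock paths of the exact conditional return."""
    rng = random.Random(seed)
    returns = []
    for _ in range(n_rollouts):
        shocks = [rng.randint(0, 1) for _ in range(HORIZON)]
        ret, _ = exact_return(mu0, policy, shocks)
        returns.append(ret)
    return sum(returns) / n_rollouts


# Usage: everyone starts in state 0 and follows a uniform random policy.
mu0 = [1.0, 0.0, 0.0]
uniform_policy = [[0.5, 0.5] for _ in range(N_STATES)]
estimate = hybrid_estimate(mu0, uniform_policy, n_rollouts=50, seed=1)
```

In this sketch the only randomness comes from the sampled shock paths; given a path, the return is a deterministic function of the policy, which is the source of the variance reduction the abstract attributes to HSMs. A gradient-based method would differentiate `exact_return` with respect to policy parameters, which is omitted here.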
