LG GT MA OC | Aug 15, 2024

Stochastic Semi-Gradient Descent for Learning Mean Field Games with Population-Aware Function Approximation

MIT
arXiv:2408.08192v2 | 9 citations | h-index: 10
Originality: Incremental advance
AI Analysis

This work addresses scalable and stable learning in large-population multi-agent systems, a problem of interest to researchers in game theory and reinforcement learning. It offers a novel method, though the improvement over existing approaches is incremental.

The paper tackles the inefficiency and instability of traditional fixed-point iteration methods for learning mean field games (MFGs) by treating the policy and the population as a single unified parameter. The result is a stochastic gradient descent-type method (SemiSGD) with population-aware linear function approximation. The method converges to the equilibrium for linear MFGs under the standard contractivity condition and to a neighborhood of the equilibrium under a more practical condition. The findings are validated with six experiments on three MFGs.

Mean field games (MFGs) model interactions in large-population multi-agent systems through population distributions. Traditional learning methods for MFGs are based on fixed-point iteration (FPI), where policy updates and induced population distributions are computed separately and sequentially. However, FPI-type methods may suffer from inefficiency and instability due to potential oscillations caused by this forward-backward procedure. In this work, we propose a novel perspective that treats the policy and population as a unified parameter controlling the game dynamics. By applying stochastic parameter approximation to this unified parameter, we develop SemiSGD, a simple stochastic gradient descent (SGD)-type method, where an agent updates its policy and population estimates simultaneously and fully asynchronously. Building on this perspective, we further apply linear function approximation (LFA) to the unified parameter, resulting in the first population-aware LFA (PA-LFA) for learning MFGs on continuous state-action spaces. A comprehensive finite-time convergence analysis is provided for SemiSGD with PA-LFA, including its convergence to the equilibrium for linear MFGs -- a class of MFGs with a linear structure concerning the population -- under the standard contractivity condition, and to a neighborhood of the equilibrium under a more practical condition. We also characterize the approximation error for non-linear MFGs. We validate our theoretical findings with six experiments on three MFGs.
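The simultaneous update described in the abstract can be illustrated with a small sketch. The code below is not the authors' implementation; it is a minimal tabular toy, assuming a hypothetical 3-state crowd-aversion game whose reward, transition rule, softmax policy, and step size are all made up for illustration. A tabular representation is used here as the simplest case of linear features (one-hot). At every step a single agent takes one SARSA-style semi-gradient step on its value estimate `Q` and, in the same step, moves its population estimate `M` toward the state it just observed, instead of alternating full policy and population solves as in fixed-point iteration.

```python
import numpy as np


def semisgd_tabular(num_states=3, num_actions=2, steps=5000, alpha=0.05,
                    gamma=0.9, temperature=0.5, seed=0):
    """Toy single-trajectory update of a unified parameter (Q, M).

    All game details (reward, dynamics, policy) are illustrative assumptions.
    """
    rng = np.random.default_rng(seed)
    Q = np.zeros((num_states, num_actions))      # action-value estimate
    M = np.full(num_states, 1.0 / num_states)    # population estimate

    def policy(s):
        # Softmax policy derived from the current Q estimate (assumed choice).
        logits = Q[s] / temperature
        p = np.exp(logits - logits.max())
        p /= p.sum()
        return int(rng.choice(num_actions, p=p))

    def step(s, a):
        # Hypothetical dynamics: shift by `a` states on a ring,
        # with a 10% chance of jumping to a uniformly random state.
        if rng.random() < 0.1:
            return int(rng.integers(num_states))
        return (s + a) % num_states

    def reward(s, a, pop):
        # Hypothetical crowd-aversion reward plus a small movement cost.
        return -pop[s] - 0.1 * a

    s = 0
    a = policy(s)
    for _ in range(steps):
        r = reward(s, a, M)
        s_next = step(s, a)
        a_next = policy(s_next)

        # Semi-gradient (SARSA-style) step on the value part of the parameter.
        td_error = r + gamma * Q[s_next, a_next] - Q[s, a]
        Q[s, a] += alpha * td_error

        # Same-step update of the population part toward the observed state.
        onehot = np.zeros(num_states)
        onehot[s_next] = 1.0
        M += alpha * (onehot - M)

        s, a = s_next, a_next
    return Q, M


if __name__ == "__main__":
    Q, M = semisgd_tabular()
    print("population estimate:", M)
    print("value estimate:\n", Q)
```

Because `M` is updated as a convex combination of its previous value and a one-hot vector, it remains a probability vector throughout. The shared step size `alpha` for both parts is a simplification chosen for this sketch.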

