AP Nov 7, 2016
Mean Field Type Control with Congestion (II): An Augmented Lagrangian Method
Yves Achdou, Mathieu Lauriere
This work deals with a numerical method for solving a mean-field type control problem with congestion. It is the continuation of an article by the same authors, in which suitably defined weak solutions of the system of partial differential equations arising from the model were discussed and existence and uniqueness were proved. Here, the focus is on numerical methods: a monotone finite difference scheme is proposed and shown to have a variational interpretation. Then an Alternating Direction Method of Multipliers, based on an augmented Lagrangian, is proposed for solving the variational problem. Two kinds of boundary conditions are considered: periodic conditions and more realistic boundary conditions associated with state-constrained problems. Various test cases and numerical results are presented.
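For orientation, the generic textbook form of the Alternating Direction Method of Multipliers is sketched below. This is not the specific discrete formulation of the paper, which the abstract does not give: the functions f and g, the operators A and B, the vector c, the multiplier \lambda and the penalty parameter r > 0 are all placeholder symbols assumed for illustration.

```latex
% Generic constrained problem (placeholder symbols, not the paper's notation)
\min_{x,\,z} \; f(x) + g(z) \quad \text{subject to} \quad A x + B z = c .

% Augmented Lagrangian with multiplier \lambda and penalty parameter r > 0
L_r(x, z, \lambda) = f(x) + g(z) + \lambda^{\top} (A x + B z - c)
                     + \frac{r}{2} \, \lVert A x + B z - c \rVert_2^2 .

% One ADMM iteration: two alternating minimizations, then a multiplier update
x^{k+1} = \operatorname*{arg\,min}_{x} \; L_r(x, z^{k}, \lambda^{k}), \\
z^{k+1} = \operatorname*{arg\,min}_{z} \; L_r(x^{k+1}, z, \lambda^{k}), \\
\lambda^{k+1} = \lambda^{k} + r \, (A x^{k+1} + B z^{k+1} - c) .
```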
GT Mar 6, 2024
Population-aware Online Mirror Descent for Mean-Field Games by Deep Reinforcement Learning
Zida Wu, Mathieu Lauriere, Samuel Jia Cong Chua et al.
Mean Field Games (MFGs) have the ability to handle large-scale multi-agent systems, but learning Nash equilibria in MFGs remains a challenging task. In this paper, we propose a deep reinforcement learning (DRL) algorithm that achieves population-dependent Nash equilibrium without the need for averaging or sampling from history, inspired by Munchausen RL and Online Mirror Descent. Through the design of an additional inner-loop replay buffer, the agents can effectively learn to achieve Nash equilibrium from any distribution, mitigating catastrophic forgetting. The resulting policy can be applied to various initial distributions. Numerical experiments on four canonical examples demonstrate that our algorithm has better convergence properties than state-of-the-art algorithms, in particular a DRL version of Fictitious Play for population-dependent policies.
LG Sep 3, 2025
Population-aware Online Mirror Descent for Mean-Field Games with Common Noise by Deep Reinforcement Learning
Zida Wu, Mathieu Lauriere, Matthieu Geist et al.
Mean Field Games (MFGs) offer a powerful framework for studying large-scale multi-agent systems. Yet, learning Nash equilibria in MFGs remains a challenging problem, particularly when the initial distribution is unknown or when the population is subject to common noise. In this paper, we introduce an efficient deep reinforcement learning (DRL) algorithm designed to achieve population-dependent Nash equilibria without relying on averaging or historical sampling, inspired by Munchausen RL and Online Mirror Descent. The resulting policy is adaptable to various initial distributions and sources of common noise. Through numerical experiments on seven canonical examples, we demonstrate that our algorithm exhibits superior convergence properties compared to state-of-the-art algorithms, particularly a DRL version of Fictitious Play for population-dependent policies. The performance in the presence of common noise underscores the robustness and adaptability of our approach.
OC Jun 25, 2021
Reinforcement Learning for Mean Field Games, with Applications to Economics
Andrea Angiuli, Jean-Pierre Fouque, Mathieu Lauriere
Mean field games (MFG) and mean field control problems (MFC) are frameworks to study Nash equilibria or social optima in games with a continuum of agents. These problems can be used to approximate competitive or cooperative games with a large finite number of agents and have found a broad range of applications, in particular in economics. In recent years, the question of learning in MFG and MFC has garnered interest, both as a way to compute solutions and as a way to model how large populations of learners converge to an equilibrium. Of particular interest is the setting where the agents do not know the model, which leads to the development of reinforcement learning (RL) methods. After reviewing the literature on this topic, we present a two timescale approach with RL for MFG and MFC, which relies on a unified Q-learning algorithm. The main novelty of this method is to simultaneously update an action-value function and a distribution but with different rates, in a model-free fashion. Depending on the ratio of the two learning rates, the algorithm learns either the MFG or the MFC solution. To illustrate this method, we apply it to a mean field problem of accumulated consumption in finite horizon with HARA utility function, and to a trader's optimal liquidation problem.
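The two-timescale idea described above (updating an action-value function and a distribution simultaneously, at different rates, from samples only) can be illustrated with a minimal tabular sketch. Everything specific here is an assumption made for illustration and is not taken from the paper: the ring-shaped state space, the crowd-averse reward, the epsilon-greedy exploration, and the names `two_timescale_q_learning`, `rho_q` and `rho_mu`. The sketch does not claim which ratio of rates corresponds to the MFG or the MFC solution.

```python
import numpy as np


def two_timescale_q_learning(n_states=5, rho_q=0.1, rho_mu=0.01,
                             gamma=0.9, eps=0.1, n_steps=5000, seed=0):
    """Toy tabular two-timescale update of a Q-table and a distribution.

    Hypothetical environment: states on a ring, actions move left, stay,
    or right; the reward penalizes crowded states and movement.
    """
    rng = np.random.default_rng(seed)
    moves = np.array([-1, 0, 1])              # assumed action set
    q = np.zeros((n_states, len(moves)))      # action-value estimates
    mu = np.full(n_states, 1.0 / n_states)    # distribution estimate
    x = int(rng.integers(n_states))

    for _ in range(n_steps):
        # Epsilon-greedy action selection (an assumed exploration rule).
        if rng.random() < eps:
            a = int(rng.integers(len(moves)))
        else:
            a = int(np.argmax(q[x]))

        x_next = int((x + moves[a]) % n_states)

        # Assumed crowd-averse reward depending on the current estimate mu.
        r = -mu[x_next] - 0.1 * abs(moves[a])

        # Distribution update with its own rate rho_mu. This is a convex
        # combination, so mu remains a probability vector.
        one_hot = np.zeros(n_states)
        one_hot[x_next] = 1.0
        mu += rho_mu * (one_hot - mu)

        # Standard Q-learning update with a different rate rho_q.
        td_target = r + gamma * q[x_next].max()
        q[x, a] += rho_q * (td_target - q[x, a])

        x = x_next

    return q, mu


if __name__ == "__main__":
    q, mu = two_timescale_q_learning()
    print("distribution estimate:", np.round(mu, 3))
```

Swapping the relative sizes of `rho_q` and `rho_mu` changes which quantity is effectively frozen while the other adapts, which is the mechanism the abstract attributes to the ratio of the two learning rates.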