LG | OC | ML | Mar 22, 2022

Scalable Deep Reinforcement Learning Algorithms for Mean Field Games

arXiv:2203.11973v2 | 66 citations | h-index: 51
AI Analysis

This work addresses a scalability bottleneck for researchers and practitioners applying deep reinforcement learning to large-population strategic games, representing an incremental improvement over existing methods.

The paper tackles the challenge of scaling deep reinforcement learning for Mean Field Games by addressing the difficulty of mixing approximated quantities like strategies or q-values with non-linear function approximators such as neural networks. It proposes two methods—one using distillation and another based on regularization—that enable efficient use of deep RL algorithms and outperform state-of-the-art baselines in numerical experiments.

Mean Field Games (MFGs) have been introduced to efficiently approximate games with very large populations of strategic agents. Recently, the question of learning equilibria in MFGs has gained momentum, particularly using model-free reinforcement learning (RL) methods. One factor limiting further scaling with RL is that existing algorithms for solving MFGs require the mixing of approximated quantities such as strategies or $q$-values. This is far from trivial for non-linear function approximators that enjoy good generalization properties, e.g., neural networks. We propose two methods to address this shortcoming. The first learns a mixed strategy by distilling historical data into a neural network and is applied to the Fictitious Play algorithm. The second is an online mixing method based on regularization that does not require memorizing historical data or previous estimates. It is used to extend Online Mirror Descent. We demonstrate numerically that these methods efficiently enable the use of Deep RL algorithms to solve various MFGs. In addition, we show that these methods outperform state-of-the-art baselines from the literature.
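
To make the mixing difficulty concrete, here is a minimal sketch in plain Python. It is not the paper's method: it assumes a tabular Online Mirror Descent style update that sums $q$-value estimates and applies a softmax (the abstract does not spell out the update), and it uses a hypothetical two-unit network, `tiny_net`, with hand-picked weights. With a table, the sum of past estimates is just another table. With a non-linear network, summing the parameters of two networks does not give the sum of their outputs, so the same trick has no direct analogue.

```python
import math
import random


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]


def omd_tabular_step(cum_q, q_k, lr=1.0):
    """Assumed tabular update: add the latest q-table to the running sum
    and return the new sum together with its softmax policy."""
    new_cum = [
        [c + lr * q for c, q in zip(c_row, q_row)]
        for c_row, q_row in zip(cum_q, q_k)
    ]
    policy = [softmax(row) for row in new_cum]
    return new_cum, policy


def tiny_net(params, x):
    """Hypothetical network: scalar input -> tanh hidden layer -> scalar output."""
    w1, w2 = params
    return sum(b * math.tanh(a * x) for a, b in zip(w1, w2))


# Tabular case: mixing is a running sum of tables.
random.seed(0)
n_states, n_actions = 4, 3
cum_q = [[0.0] * n_actions for _ in range(n_states)]
for _ in range(3):
    # Random numbers stand in for q-value estimates (illustrative only).
    q_k = [[random.gauss(0, 1) for _ in range(n_actions)] for _ in range(n_states)]
    cum_q, policy = omd_tabular_step(cum_q, q_k)

# Non-linear case: summing parameters is not summing outputs.
params_a = ([0.5, -1.0], [1.0, 2.0])
params_b = ([1.5, 0.3], [-0.5, 1.0])
x = 1.0
sum_of_outputs = tiny_net(params_a, x) + tiny_net(params_b, x)
summed_params = tuple(
    [a + b for a, b in zip(pa, pb)] for pa, pb in zip(params_a, params_b)
)
output_of_summed_params = tiny_net(summed_params, x)
gap = abs(sum_of_outputs - output_of_summed_params)
print(f"gap between summed outputs and output of summed parameters: {gap:.4f}")
```

The non-zero gap is the reason a naive approach would have to keep every past network in memory, which is the cost the distillation and regularization methods described above aim to avoid.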
