Convergence of Actor-Critic Learning for Mean Field Games and Mean Field Control in Continuous Spaces
This work provides theoretical guarantees for reinforcement learning methods in large-scale multi-agent systems, though the contribution is incremental because it builds on prior work.
The paper establishes convergence of a deep actor-critic algorithm for solving Mean Field Game and Mean Field Control problems in continuous spaces over an infinite horizon, extends the results to Mean Field Control Games, and validates them with numerical experiments on linear-quadratic examples.
We establish the convergence of the deep actor-critic reinforcement learning algorithm presented in [Angiuli et al., 2023a] in the setting of continuous state and action spaces with an infinite discrete-time horizon. This algorithm provides solutions to Mean Field Game (MFG) or Mean Field Control (MFC) problems depending on the ratio between two learning rates: one for the value function and the other for the mean field term. In the MFC case, to rigorously identify the limit, we introduce a discretization of the state and action spaces, following the approach used in the finite-space case in [Angiuli et al., 2023b]. The convergence proofs rely on a generalization of the two-timescale framework introduced in [Borkar, 1997]. We further extend our convergence results to Mean Field Control Games, which involve locally cooperative and globally competitive populations. Finally, we present numerical experiments for linear-quadratic problems in one and two dimensions, for which explicit solutions are available.
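For readers unfamiliar with the two-timescale framework of [Borkar, 1997], the generic scheme can be sketched as follows. The symbols here (x_n, y_n, h, g, a_n, b_n, and the noise terms M) are illustrative assumptions and not the paper's notation; the generalization used in the convergence proofs may differ in its assumptions.

```latex
% Generic two-timescale stochastic approximation (illustrative notation only).
% x_n: iterate updated on the fast timescale, y_n: iterate updated on the slow timescale.
\begin{align*}
  x_{n+1} &= x_n + a_n \left[ h(x_n, y_n) + M^{(1)}_{n+1} \right], \\
  y_{n+1} &= y_n + b_n \left[ g(x_n, y_n) + M^{(2)}_{n+1} \right],
\end{align*}
% Standard step-size conditions: both sequences are non-summable but
% square-summable, and the slow rate is asymptotically negligible
% relative to the fast one.
\[
  \sum_n a_n = \sum_n b_n = \infty, \qquad
  \sum_n \left( a_n^2 + b_n^2 \right) < \infty, \qquad
  \frac{b_n}{a_n} \to 0 .
\]
```

Under these conditions the fast iterate sees the slow one as nearly frozen, while the slow iterate sees the fast one as already equilibrated. In the setting of the abstract, the value function and the mean field term play the roles of the two iterates, and the ratio of their learning rates decides which of them is the fast one.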