Reinforcement Learning for SBM Graphon Games with Re-Sampling
This addresses the limitation of homogeneity assumptions in mean-field games for multi-population scenarios with unknown network structures, representing an incremental improvement in the field.
The paper tackles the problem of applying mean-field approximations to large populations with complex network structures by proposing a Graphon Game with Re-Sampling (GGR-S) model, and it develops a reinforcement learning algorithm that converges to the Nash equilibrium with finite sample guarantees.
The Mean-Field approximation is a tractable approach for studying large population dynamics. However, its assumption on homogeneity and universal connections among all agents limits its applicability in many real-world scenarios. Multi-Population Mean-Field Game (MP-MFG) models have been introduced in the literature to address these limitations. When the underlying Stochastic Block Model is known, we show that a Policy Mirror Ascent algorithm finds the MP-MFG Nash Equilibrium. In more realistic scenarios where the block model is unknown, we propose a re-sampling scheme from a graphon integrated with the finite N-player MP-MFG model. We develop a novel learning framework based on a Graphon Game with Re-Sampling (GGR-S) model, which captures the complex network structures of agents' connections. We analyze GGR-S dynamics and establish the convergence to dynamics of MP-MFG. Leveraging this result, we propose an efficient sample-based N-player Reinforcement Learning algorithm for GGR-S without population manipulation, and provide a rigorous convergence analysis with finite sample guarantee.