LG GTMay 8, 2025

Reinforcement Learning for Game-Theoretic Resource Allocation on Graphs

arXiv:2505.06319v17.12 citationsh-index: 2IEEE Trans Autom Sci Eng

Originality Incremental advance

AI Analysis

This addresses resource allocation challenges in competitive scenarios like network security or logistics, but it is incremental as it adapts existing RL methods to a specific graph-based problem.

The paper tackled the problem of game-theoretic resource allocation on graphs by modeling it as a multi-step Colonel Blotto Game and applying reinforcement learning methods like DQN and PPO, resulting in RL agents consistently outperforming baseline strategies and achieving a balanced 50% win rate against learned policies.

Game-theoretic resource allocation on graphs (GRAG) involves two players competing over multiple steps to control nodes of interest on a graph, a problem modeled as a multi-step Colonel Blotto Game (MCBG). Finding optimal strategies is challenging due to the dynamic action space and structural constraints imposed by the graph. To address this, we formulate the MCBG as a Markov Decision Process (MDP) and apply Reinforcement Learning (RL) methods, specifically Deep Q-Network (DQN) and Proximal Policy Optimization (PPO). To enforce graph constraints, we introduce an action-displacement adjacency matrix that dynamically generates valid action sets at each step. We evaluate RL performance across a variety of graph structures and initial resource distributions, comparing against random, greedy, and learned RL policies. Experimental results show that both DQN and PPO consistently outperform baseline strategies and converge to a balanced $50\%$ win rate when competing against the learned RL policy. Particularly, on asymmetric graphs, RL agents successfully exploit structural advantages and adapt their allocation strategies, even under disadvantageous initial resource distributions.

View on arXiv PDF

Similar