Towards Generalizability of Multi-Agent Reinforcement Learning in Graphs with Recurrent Message Passing
This work addresses the limited generalizability of decentralized multi-agent systems in graph-based applications such as communication networks, and offers an incremental improvement over existing methods.
The paper tackles the generalization of multi-agent reinforcement learning across diverse graph-based environments. It proposes a recurrent message-passing model that keeps information flowing continuously through the graph, so that agents can adapt to changes in the graph, as demonstrated on routing tasks across 1000 diverse graphs.
Graph-based environments pose unique challenges to multi-agent reinforcement learning. In decentralized approaches, agents operate within a given graph and make decisions based on partial or outdated observations. The size of the observed neighborhood limits the generalizability to different graphs and affects the reactivity of agents, the quality of the selected actions, and the communication overhead. This work focuses on generalizability and resolves the trade-off in observed neighborhood size with a continuous information flow in the whole graph. We propose a recurrent message-passing model that iterates with the environment's steps and allows nodes to create a global representation of the graph by exchanging messages with their neighbors. Agents receive the resulting learned graph observations based on their location in the graph. Our approach can be used in a decentralized manner at runtime and in combination with a reinforcement learning algorithm of choice. We evaluate our method across 1000 diverse graphs in the context of routing in communication networks and find that it enables agents to generalize and adapt to changes in the graph.
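The idea of a recurrent message-passing model that iterates with the environment's steps can be illustrated with a minimal sketch. It is not the authors' architecture. The tanh update, the mean aggregation over neighbors, the random (untrained) weights, the five-node line graph, and the names `message_passing_step`, `rollout`, and `agent_observation` are all assumptions made for illustration. The sketch only shows that when each node exchanges state with its direct neighbors once per step, information spreads one hop per step and eventually reaches the whole graph.

```python
import numpy as np


def message_passing_step(hidden, adjacency, node_obs, w_self, w_msg, w_obs):
    """One message-passing iteration, executed once per environment step.

    Assumed update rule (illustrative only): mean over neighbor states,
    followed by a simple tanh recurrent cell.
    """
    # Each node aggregates only the states of its direct neighbors.
    degree = np.maximum(adjacency.sum(axis=1, keepdims=True), 1.0)
    messages = (adjacency @ hidden) / degree
    # Recurrent update from the previous state, the messages and the local input.
    return np.tanh(hidden @ w_self + messages @ w_msg + node_obs @ w_obs)


def rollout(steps, seed=0):
    """Run `steps` iterations on a hypothetical line graph 0-1-2-3-4."""
    rng = np.random.default_rng(seed)
    num_nodes, obs_dim, hidden_dim = 5, 3, 8

    adjacency = np.zeros((num_nodes, num_nodes))
    for i in range(num_nodes - 1):
        adjacency[i, i + 1] = adjacency[i + 1, i] = 1.0

    # Random weights stand in for parameters that would normally be learned.
    w_self = rng.normal(scale=0.3, size=(hidden_dim, hidden_dim))
    w_msg = rng.normal(scale=0.3, size=(hidden_dim, hidden_dim))
    w_obs = rng.normal(scale=0.3, size=(obs_dim, hidden_dim))

    # Only node 0 has a non-zero local observation.
    node_obs = np.zeros((num_nodes, obs_dim))
    node_obs[0] = 1.0

    hidden = np.zeros((num_nodes, hidden_dim))
    for _ in range(steps):
        hidden = message_passing_step(
            hidden, adjacency, node_obs, w_self, w_msg, w_obs
        )
    return hidden


def agent_observation(hidden, agent_node):
    """An agent reads the learned state of the node it is located at."""
    return hidden[agent_node]


if __name__ == "__main__":
    for steps in (1, 4, 5):
        hidden = rollout(steps)
        # Per-node magnitude of the state; zero means no information has arrived yet.
        print(steps, np.abs(hidden).sum(axis=1))
```

In this toy setup, node 4 is four hops from node 0, so its state stays zero for the first four steps and becomes non-zero at the fifth. An agent located at node 4 would therefore see information originating at node 0 only after five environment steps, without any node ever observing more than its direct neighbors.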