Graph Exploration for Effective Multi-agent Q-Learning
This work addresses the challenge of effective exploration in multi-agent systems and is aimed at researchers and practitioners. It offers a decentralized solution with reduced communication overhead, though it appears incremental in that it builds on existing MARL methods.
The paper tackles exploration in multi-agent reinforcement learning with graph-based communication. It proposes a technique in which neighboring agents collaborate to estimate uncertainty, enabling efficient exploration without counting mechanisms and with minimal decentralized communication. The results comprise theoretical verification for discrete-state scenarios and experimental validation for continuous-state ones, where agents exchange only a single parameter vector.
This paper proposes an exploration technique for multi-agent reinforcement learning (MARL) with graph-based communication among agents. We assume that the individual reward received by each agent is independent of the actions of the other agents, while the agents' policies are coupled. In the proposed framework, neighbouring agents collaborate to estimate the uncertainty about the state-action space in order to execute more efficient explorative behaviour. Unlike existing work, the proposed algorithm does not require counting mechanisms and can be applied to continuous-state environments without complex conversion techniques. Moreover, the proposed scheme allows agents to communicate in a fully decentralized manner with minimal information exchange: in continuous-state scenarios, each agent needs to exchange only a single parameter vector. The performance of the algorithm is verified with theoretical results for discrete-state scenarios and with experiments for continuous-state ones.
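To make the communication pattern concrete, the following is a minimal sketch of one exchange round over a communication graph, in which each agent receives a single parameter vector from each of its neighbours. The abstract does not specify how uncertainty is computed, so the disagreement measure used here (mean squared distance between an agent's vector and those of its neighbours) is purely an illustrative assumption, as are all names (`neighbour_exchange`, `disagreement`, `graph`, `params`). It should not be read as the paper's algorithm.

```python
from typing import Dict, List

Vector = List[float]


def neighbour_exchange(
    params: Dict[int, Vector], graph: Dict[int, List[int]]
) -> Dict[int, List[Vector]]:
    """Deliver to each agent one parameter vector per graph neighbour."""
    return {
        agent: [params[n] for n in neighbours]
        for agent, neighbours in graph.items()
    }


def disagreement(own: Vector, received: List[Vector]) -> float:
    """Hypothetical uncertainty proxy (an assumption, not from the paper):
    mean squared distance between an agent's vector and its neighbours'."""
    if not received:
        return 0.0
    total = 0.0
    for other in received:
        total += sum((a - b) ** 2 for a, b in zip(own, other))
    return total / len(received)


# Hypothetical line graph 0 - 1 - 2 with made-up parameter vectors.
graph = {0: [1], 1: [0, 2], 2: [1]}
params = {0: [0.0, 0.0], 1: [1.0, 0.0], 2: [1.0, 2.0]}

inbox = neighbour_exchange(params, graph)
uncertainty = {i: disagreement(params[i], inbox[i]) for i in graph}
```

In this sketch the communication is fully local: an agent only ever reads the vectors of its direct neighbours, so no central coordinator or global count table is involved. An agent could, for example, explore more when its local disagreement value is high, but how the paper actually turns neighbour information into exploratory behaviour is not stated in the abstract.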