PAC Guarantees for Cooperative Multi-Agent Reinforcement Learning with Restricted Communication
This work addresses communication constraints in multi-agent systems, a domain-specific problem for AI and robotics. The contribution is incremental, as it extends prior work on non-interacting agents.
The paper provides PAC guarantees for cooperative multi-agent reinforcement learning under noisy, resource-limited communication. It derives improved sample complexity bounds and a theoretically motivated algorithm that outperforms naive information fusion methods.
We develop model-free PAC performance guarantees for multiple concurrent MDPs, extending recent works where a single learner interacts with multiple non-interacting agents in a noise-free environment. Our framework allows noisy and resource-limited communication between agents, and we develop novel PAC guarantees in this extended setting. By allowing communication between the agents themselves, we suggest improved PAC-exploration algorithms that can overcome the communication noise and lead to improved sample complexity bounds. We provide a theoretically motivated algorithm that optimally combines information from the resource-limited agents, thereby analyzing the interaction between noise and communication constraints that are ubiquitous in real-world systems. We present empirical results for a simple task that support our theoretical formulations and improve upon naive information fusion methods.
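To make the contrast between optimal combination and naive fusion concrete, the following is a minimal sketch and not the paper's algorithm, which the abstract does not specify. It swaps in standard inverse-variance weighting and assumes each agent reports a scalar estimate corrupted by independent, zero-mean noise of known variance. All function names are hypothetical.

```python
def naive_fusion(estimates):
    """Unweighted mean of the agents' reports (the naive baseline)."""
    return sum(estimates) / len(estimates)


def inverse_variance_fusion(estimates, noise_vars):
    """Weight each report by 1/variance.

    Assumption: noise is independent and zero-mean with known variance,
    in which case this is the minimum-variance unbiased linear combination.
    """
    weights = [1.0 / v for v in noise_vars]
    total = sum(weights)
    return sum(w * x for w, x in zip(weights, estimates)) / total


def naive_variance(noise_vars):
    """Variance of the unweighted mean under independent noise."""
    n = len(noise_vars)
    return sum(noise_vars) / (n * n)


def fused_variance(noise_vars):
    """Variance of the inverse-variance weighted estimate."""
    return 1.0 / sum(1.0 / v for v in noise_vars)


if __name__ == "__main__":
    # Hypothetical channel noise: one clean link and two noisy ones.
    noise_vars = [0.01, 1.0, 4.0]
    print("naive variance:", naive_variance(noise_vars))
    print("fused variance:", fused_variance(noise_vars))
```

Under these assumptions the weighted estimate never has higher variance than the unweighted mean, and the two coincide when all channels are equally noisy. The gap grows as the channel qualities become more uneven.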