Inference-Based Deterministic Messaging For Multi-Agent Communication
This work addresses coordination issues in multi-agent systems, offering an incremental improvement to existing decentralized training methods for complex domains.
The paper tackles suboptimal convergence in decentralized multi-agent communication. It proposes a deterministic messaging policy in which the sender picks the message that best helps the receiver infer the sender's observation. Empirically, agents using this policy converge to the optimal policy in nearly all runs of matrix-based signaling games, and the approach improves performance in a partially observable gridworld environment.
Communication is essential for coordination among humans and animals. Therefore, with the introduction of intelligent agents into the world, agent-to-agent and agent-to-human communication becomes necessary. In this paper, we first study learning in matrix-based signaling games to empirically show that decentralized methods can converge to a suboptimal policy. We then propose a modification to the messaging policy, in which the sender deterministically chooses the best message that helps the receiver to infer the sender's observation. Using this modification, we see, empirically, that the agents converge to the optimal policy in nearly all the runs. We then apply this method to a partially observable gridworld environment which requires cooperation between two agents and show that, with appropriate approximation methods, the proposed sender modification can enhance existing decentralized training methods for more complex domains as well.
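To make the proposed sender modification concrete, the following is a minimal sketch of one way such a rule could look in a small signaling game. It is an illustration under stated assumptions, not the paper's implementation: the abstract does not give the selection rule, so the sketch assumes the sender holds a tabular stochastic policy `policy[s][m]` and a known prior over observations, and scores each message by the Bayesian posterior a receiver would assign to the true observation. The names `inference_based_message`, `POLICY`, and `PRIOR` are hypothetical.

```python
# Hypothetical sketch: pick the message that maximizes the receiver's
# posterior belief in the sender's true observation.
# ASSUMPTION: the posterior is computed by Bayes' rule from the sender's
# own tabular policy and a known prior; the source does not specify this.

def inference_based_message(policy, prior, state):
    """Return the message index m maximizing P(state | m).

    policy[s][m] -- probability that the sender emits m when observing s
    prior[s]     -- prior probability of observation s
    state        -- the sender's actual observation
    """
    n_states = len(policy)
    n_messages = len(policy[0])
    best_message, best_posterior = 0, -1.0
    for m in range(n_messages):
        # P(m) = sum over s of P(s) * P(m | s)
        marginal = sum(prior[s] * policy[s][m] for s in range(n_states))
        if marginal == 0.0:
            continue  # message is never sent, so the posterior is undefined
        # P(state | m) = P(state) * P(m | state) / P(m)
        posterior = prior[state] * policy[state][m] / marginal
        if posterior > best_posterior:
            best_message, best_posterior = m, posterior
    return best_message


# Illustrative (made-up) policy: both observations currently prefer message 0.
POLICY = [[0.6, 0.4],
          [0.7, 0.3]]
PRIOR = [0.5, 0.5]

for s in range(len(POLICY)):
    greedy = max(range(len(POLICY[s])), key=lambda m: POLICY[s][m])
    inferred = inference_based_message(POLICY, PRIOR, s)
    print(f"observation {s}: greedy message {greedy}, inference-based message {inferred}")
```

In this toy table, a greedy sender would emit message 0 for both observations, leaving the receiver unable to tell them apart. The posterior-based rule instead assigns message 1 to observation 0 and message 0 to observation 1, which is the kind of disambiguation the abstract describes.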