Emergence of Theory of Mind Collaboration in Multiagent Systems
This paper addresses the problem of improving collaboration in multiagent systems, which is relevant to researchers and practitioners, though the contribution appears incremental because it builds on existing ToM and POMDP frameworks.
The paper tackles the challenge of integrating Theory of Mind (ToM) into multiagent systems by incorporating it into a multiagent partially observable Markov decision process (POMDP) and proposing an adaptive training algorithm. In two games, the resulting method surpasses previous decentralized execution algorithms that do not model ToM.
Currently, in the study of multiagent systems, the intentions of agents are usually ignored. Nonetheless, as pointed out by Theory of Mind (ToM), people regularly reason about others' mental states, including beliefs, goals, and intentions, to gain a performance advantage in competition, cooperation, or coalition. However, due to its intrinsic recursion and the intractability of modeling distributions over beliefs, integrating ToM into multiagent planning and decision making remains a challenge. In this paper, we incorporate ToM into the multiagent partially observable Markov decision process (POMDP) and propose an adaptive training algorithm to develop effective collaboration between agents with ToM. We evaluate our algorithm on two games, where it surpasses all previous decentralized execution algorithms that do not model ToM.
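
To make the phrase "distribution over belief" concrete, the following minimal sketch shows a single discrete Bayesian belief update of the kind found in standard POMDP formulations. It is not the algorithm proposed in this paper. The function name `update_belief`, the goal labels, and all probabilities are hypothetical and chosen only for illustration. Here, agent A holds a belief over a partner's hidden goal and revises it after observing one of the partner's actions.

```python
from typing import Dict


def update_belief(
    prior: Dict[str, float],
    likelihood: Dict[str, Dict[str, float]],
    observation: str,
) -> Dict[str, float]:
    """Bayes rule: posterior(s) is proportional to P(observation | s) * prior(s)."""
    unnormalized = {
        state: p * likelihood[state].get(observation, 0.0)
        for state, p in prior.items()
    }
    total = sum(unnormalized.values())
    if total == 0.0:
        raise ValueError("observation has zero probability under the prior")
    return {state: value / total for state, value in unnormalized.items()}


# Hypothetical numbers: A's belief over the partner's goal ("left" or "right").
prior = {"left": 0.5, "right": 0.5}

# Hypothetical model of how likely each action is under each goal.
likelihood = {
    "left": {"move_left": 0.8, "move_right": 0.2},
    "right": {"move_left": 0.3, "move_right": 0.7},
}

# A observes the partner moving left and shifts belief toward the "left" goal.
posterior = update_belief(prior, likelihood, "move_left")
```

In a ToM setting the likelihood model is not fixed: the partner's actions depend on the partner's own belief about A, which in turn depends on A's belief about the partner, and so on. This nesting is one way to read the "intrinsic recursion" mentioned above, and maintaining a full distribution at every level of nesting quickly becomes intractable.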