LG AI MA Jul 8, 2022

Interaction Pattern Disentangling for Multi-Agent Reinforcement Learning

Shunyu Liu, Jie Song, Yihe Zhou, Na Yu, Kaixuan Chen, Zunlei Feng, Mingli Song

arXiv:2207.03902v411.117 citations h-index: 25 Has Code

Originality: Incremental advance

AI Analysis

This paper addresses over-fitting on noisy interactions in multi-agent systems, an incremental improvement for researchers and practitioners in cooperative AI.

The paper tackles noisy entity interactions in multi-agent reinforcement learning by disentangling them into interaction prototypes. The authors report that this improves generalizability and interpretability, with results superior to state-of-the-art methods on single-task, multi-task, and zero-shot benchmarks.

Deep cooperative multi-agent reinforcement learning has demonstrated remarkable success across a wide spectrum of complex control tasks. However, recent advances in multi-agent learning mainly focus on value decomposition while leaving entity interactions intertwined, which easily leads to over-fitting on noisy interactions between entities. In this work, we introduce a novel interactiOn Pattern disenTangling (OPT) method that disentangles the entity interactions into interaction prototypes, each of which represents an underlying interaction pattern within a subgroup of the entities. OPT facilitates filtering the noisy interactions between irrelevant entities and thus significantly improves generalizability as well as interpretability. Specifically, OPT introduces a sparse disagreement mechanism to encourage sparsity and diversity among the discovered interaction prototypes. The model then selectively restructures these prototypes into a compact interaction pattern using an aggregator with learnable weights. To alleviate the training instability caused by partial observability, we propose to maximize the mutual information between the aggregation weights and the history behaviors of each agent. Experiments on single-task, multi-task and zero-shot benchmarks demonstrate that the proposed method yields results superior to its state-of-the-art counterparts. Our code is available at https://github.com/liushunyu/OPT.
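
The aggregation step described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation (see the linked repository for that). It assumes each interaction prototype is a row-normalized N x N matrix over entities, that the aggregator is a softmax over one learnable logit per prototype, and it uses mean pairwise cosine similarity as a stand-in diversity measure; the abstract does not specify the actual sparse disagreement loss. All function and variable names here are hypothetical.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax along the given axis.
    z = x - x.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def aggregate_prototypes(prototypes, logits):
    # prototypes: (K, N, N) interaction matrices; logits: (K,) learnable scores.
    # Assumption: the aggregator is a convex combination with softmax weights.
    weights = softmax(logits)
    pattern = np.tensordot(weights, prototypes, axes=1)  # -> (N, N)
    return pattern, weights

def similarity_penalty(prototypes):
    # Stand-in diversity measure (NOT the paper's loss): mean pairwise cosine
    # similarity between flattened prototypes; lower means more diverse.
    k = prototypes.shape[0]
    flat = prototypes.reshape(k, -1)
    flat = flat / (np.linalg.norm(flat, axis=1, keepdims=True) + 1e-8)
    sim = flat @ flat.T
    return sim[~np.eye(k, dtype=bool)].mean()

rng = np.random.default_rng(0)
K, N = 3, 4  # hypothetical sizes: 3 prototypes over 4 entities
prototypes = softmax(rng.normal(size=(K, N, N)), axis=-1)
logits = np.array([2.0, 0.0, -1.0])

pattern, weights = aggregate_prototypes(prototypes, logits)
penalty = similarity_penalty(prototypes)
```

Because every prototype row and the weight vector each sum to one, the aggregated pattern is itself row-normalized, so it can be read as a single compact interaction map over the entities. In training, the logits would be produced by a learned network and the penalty would be added to the loss.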
