AI

Nov 23, 2020

Consolidation via Policy Information Regularization in Deep RL for Multi-Agent Games

Tailia Malloy, Tim Klinger, Miao Liu, Matthew Riemer, Gerald Tesauro, Chris R. Sims

arXiv:2011.11517v1

7.13 citations

Originality: Incremental advance

AI Analysis

This work addresses the problem of non-stationarity in multi-agent reinforcement learning environments by promoting more robust policies, which is relevant for researchers working on multi-agent game AI.

This paper introduces an information-theoretic constraint on learned policy complexity within the MADDPG algorithm for multi-agent games. The approach, Capacity-Limited MADDPG, demonstrates improved learning performance in both cooperative and competitive multi-agent tasks.

This paper introduces an information-theoretic constraint on learned policy complexity in the Multi-Agent Deep Deterministic Policy Gradient (MADDPG) reinforcement learning algorithm. Previous research with a related approach in continuous control experiments suggests that this method favors learning policies that are more robust to changing environment dynamics. The multi-agent game setting naturally requires this type of robustness, as other agents' policies change throughout learning, introducing a nonstationary environment. For this reason, recent methods in continual learning are compared to our approach, termed Capacity-Limited MADDPG. Results from experimentation in multi-agent cooperative and competitive tasks demonstrate that the capacity-limited approach is a good candidate for improving learning performance in these environments.
