Nov 21, 2022

Value-based CTDE Methods in Symmetric Two-team Markov Game: from Cooperation to Team Competition

arXiv:2211.11886v2, 4 citations, h-index: 43
Originality: Synthesis-oriented
AI Analysis

This work addresses team competition in multi-agent reinforcement learning, but it is incremental as it applies existing CTDE methods to a modified environment.

The paper tackles the problem of training a team of agents to compete against multiple opposing strategies in a symmetric two-team Markov game. It finds that training against multiple evolving strategies yields the best performance when teams are evaluated against several strategies.

In this paper, we identify the best learning scenario to train a team of agents to compete against multiple possible strategies of opposing teams. We evaluate cooperative value-based methods in a mixed cooperative-competitive environment. We restrict ourselves to the case of a symmetric, partially observable, two-team Markov game. We selected three training methods based on the centralised training and decentralised execution (CTDE) paradigm: QMIX, MAVEN and QVMix. For each method, we considered three learning scenarios differentiated by the variety of team policies encountered during training. For our experiments, we modified the StarCraft Multi-Agent Challenge environment to create competitive environments where both teams could learn and compete simultaneously. Our results suggest that training against multiple evolving strategies achieves the best results when, for scoring their performances, teams are faced with several strategies.

Code Implementations: 2 repos
