MA · AI · Jun 30, 2024

Multi-Agent Training for Pommerman: Curriculum Learning and Population-based Self-Play Approach

arXiv:2407.00662v2 · 5 citations
Originality: Incremental advance
AI Analysis

This paper addresses challenges in multi-agent reinforcement learning for competitive games such as Pommerman, though its approach is an incremental advance.

The study tackled training multi-agent systems for the Pommerman game by combining curriculum learning and population-based self-play, resulting in an agent that outperformed top learning agents without requiring communication among allies.

Pommerman is a multi-agent environment that has received considerable attention from researchers in recent years. This environment is an ideal benchmark for multi-agent training, providing a battleground for two teams with communication capabilities among allied agents. Pommerman presents significant challenges for model-free reinforcement learning due to delayed action effects, sparse rewards, and false positives, where opponent players can lose due to their own mistakes. This study introduces a system designed to train multi-agent systems to play Pommerman using a combination of curriculum learning and population-based self-play. We also tackle two challenging problems that arise when deploying a multi-agent training system for competitive games: sparse rewards and the need for a suitable matchmaking mechanism. Specifically, we propose an adaptive annealing factor, based on agents' performance, that dynamically adjusts the dense exploration reward during training. Additionally, we implement a matchmaking mechanism that uses the Elo rating system to pair agents effectively. Our experimental results demonstrate that our trained agent can outperform top learning agents without requiring communication among allied agents.
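
The two mechanisms named in the abstract can be illustrated with a minimal Python sketch. The Elo expected-score and update formulas are the standard ones; everything else is an assumption made for illustration and is not taken from the paper: the K-factor of 32, the `pick_opponent` rule that favours opponents with nearby ratings, the linear `annealing_factor` driven by win rate, and all function and parameter names.

```python
import math
import random

K = 32  # assumed K-factor; the paper's value is not stated in this summary


def expected_score(r_a: float, r_b: float) -> float:
    """Standard Elo expected score of player A against player B."""
    return 1.0 / (1.0 + 10 ** ((r_b - r_a) / 400.0))


def update_elo(r_a: float, r_b: float, score_a: float, k: float = K):
    """Return both ratings after one game.

    score_a is 1.0 for a win by A, 0.5 for a tie, 0.0 for a loss.
    """
    delta = k * (score_a - expected_score(r_a, r_b))
    return r_a + delta, r_b - delta


def pick_opponent(ratings: dict, agent: str, rng: random.Random,
                  temperature: float = 100.0) -> str:
    """Hypothetical matchmaking rule: sample an opponent from the population,
    weighting candidates by how close their rating is to the agent's."""
    others = [name for name in ratings if name != agent]
    weights = [math.exp(-abs(ratings[name] - ratings[agent]) / temperature)
               for name in others]
    return rng.choices(others, weights=weights, k=1)[0]


def annealing_factor(win_rate: float, threshold: float = 0.5) -> float:
    """Hypothetical annealing rule: 1.0 when the agent never wins,
    decreasing linearly to 0.0 once win_rate reaches the threshold."""
    return max(0.0, 1.0 - win_rate / threshold)


def shaped_reward(sparse: float, dense: float, alpha: float) -> float:
    """Sparse game outcome plus the dense exploration reward scaled by alpha."""
    return sparse + alpha * dense
```

Under these assumptions, an agent that starts losing most games receives nearly the full dense exploration reward, and the shaping term fades out as its win rate approaches the threshold, leaving only the sparse win/loss signal.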

