MA | AI | Jan 2, 2025

PIMAEX: Multi-Agent Exploration through Peer Incentivization

arXiv:2501.01266v1 | h-index: 27 | ICAART
Originality: Incremental advance
AI Analysis

This paper addresses the exploration challenge in multi-agent systems, which matters for applications such as robotics and game AI. The contribution appears incremental, since it builds on existing intrinsic-curiosity and influence-based methods.

The paper tackles the problem of exploration in multi-agent reinforcement learning by proposing a peer-incentivized reward function and communication algorithm, showing that agents using this approach outperform those without it in a deceptive environment.

While exploration in single-agent reinforcement learning has been studied extensively in recent years, considerably less work has focused on its counterpart in multi-agent reinforcement learning. To address this issue, this work proposes a peer-incentivized reward function inspired by previous research on intrinsic curiosity and influence-based rewards. The PIMAEX reward, short for Peer-Incentivized Multi-Agent Exploration, aims to improve exploration in the multi-agent setting by encouraging agents to exert influence over each other to increase the likelihood of encountering novel states. We evaluate the PIMAEX reward in conjunction with PIMAEX-Communication, a multi-agent training algorithm that employs a communication channel for agents to influence one another. The evaluation is conducted in the Consume/Explore environment, a partially observable environment with deceptive rewards, specifically designed to challenge the exploration vs. exploitation dilemma and the credit-assignment problem. The results empirically demonstrate that agents using the PIMAEX reward with PIMAEX-Communication outperform those that do not.
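
To make the idea of a peer-incentivized reward concrete, the following is a minimal sketch and not the paper's actual formulation, which the abstract does not specify. It assumes a count-based novelty bonus and a hypothetical `influence[i][j]` weight describing how strongly agent i affected agent j's behaviour. The names `novelty`, `peer_incentive_rewards`, and `beta` are illustrative only.

```python
import math
from collections import Counter


def novelty(counts, state):
    """Count-based novelty bonus (an assumption): 1 / sqrt(N(s)),
    where N(s) includes the current visit."""
    counts[state] += 1
    return 1.0 / math.sqrt(counts[state])


def peer_incentive_rewards(extrinsic, peer_novelty, influence, beta=0.5):
    """Combine each agent's environment reward with a bonus for having
    influenced peers that encountered novel states.

    extrinsic[i]     -- environment reward of agent i
    peer_novelty[j]  -- novelty bonus agent j obtained this step
    influence[i][j]  -- hypothetical weight: how much i influenced j
    beta             -- hypothetical scale of the peer-incentive term
    """
    n = len(extrinsic)
    rewards = []
    for i in range(n):
        # Agent i is credited for the novelty reached by the peers it influenced.
        bonus = sum(influence[i][j] * peer_novelty[j] for j in range(n) if j != i)
        rewards.append(extrinsic[i] + beta * bonus)
    return rewards


# Usage: two agents; agent 0 strongly influenced agent 1, which saw a new state.
counts = [Counter(), Counter()]
nov = [novelty(counts[0], "room_a"), novelty(counts[1], "room_b")]
rewards = peer_incentive_rewards(
    extrinsic=[1.0, 0.0],
    peer_novelty=nov,
    influence=[[0.0, 1.0], [0.2, 0.0]],
)
```

In this sketch an agent gains nothing from its own novelty; the bonus flows only through the peers it influenced, which mirrors the abstract's description of agents being encouraged to steer each other toward novel states.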

