LG · GT · MATH · SY · Oct 21, 2020

On Information Asymmetry in Competitive Multi-Agent Reinforcement Learning: Convergence and Optimality

arXiv:2010.10901v2
6 citations
Originality: Incremental advance
AI Analysis

This paper addresses convergence and optimality in competitive multi-agent systems and offers a theoretical foundation for asymmetric-information scenarios, though it is an incremental extension of existing Q-learning frameworks.

The paper tackles information asymmetry in competitive multi-agent reinforcement learning. It shows that when one agent observes the other's actions, learning reaches a stable outcome and the resulting policies form a near-optimal Nash equilibrium, which generally does not happen when both agents are independent learners.

In this work, we study a system of two interacting non-cooperative Q-learning agents, where one agent has the privilege of observing the other's actions. We show that this information asymmetry can lead to a stable outcome of population learning, which generally does not occur in an environment of independent learners. The resulting post-learning policies are almost optimal in the sense of the underlying game, i.e., they form a Nash equilibrium. Furthermore, we propose a Q-learning algorithm that requires predictive observation of two subsequent actions of the opponent and yields an optimal strategy provided the opponent applies a stationary strategy, and we discuss the existence of a Nash equilibrium in the underlying information-asymmetric game.
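
The asymmetric setup described above can be illustrated with a minimal sketch. This is not the paper's algorithm: it assumes a stateless, repeated 2x2 zero-sum matrix game, epsilon-greedy exploration, and a hypothetical payoff matrix `PAYOFF` chosen only for illustration. Agent A acts without seeing B, while agent B observes A's action and indexes its Q-table by it. The names `train`, `greedy`, and `epsilon_greedy` are likewise illustrative.

```python
import random

# Hypothetical payoffs (not from the paper): PAYOFF[a][b] is agent A's reward
# when A plays a and B plays b. The game is zero-sum, so B receives -PAYOFF[a][b].
PAYOFF = [[3, 1],
          [0, 2]]


def greedy(values):
    """Return the index of the largest value; ties go to the lowest index."""
    return max(range(len(values)), key=lambda i: values[i])


def epsilon_greedy(values, epsilon, rng):
    """Pick a uniformly random action with probability epsilon, else the greedy one."""
    if rng.random() < epsilon:
        return rng.randrange(len(values))
    return greedy(values)


def train(steps=20000, alpha=0.1, epsilon=0.2, seed=0):
    rng = random.Random(seed)
    q_a = [0.0, 0.0]                # agent A: one value per own action
    q_b = [[0.0, 0.0], [0.0, 0.0]]  # agent B: q_b[observed a][own action b]
    for _ in range(steps):
        a = epsilon_greedy(q_a, epsilon, rng)
        b = epsilon_greedy(q_b[a], epsilon, rng)  # B sees A's action before acting
        r_a = PAYOFF[a][b]
        r_b = -r_a
        # Stateless Q-learning update: move each estimate toward the observed reward.
        q_a[a] += alpha * (r_a - q_a[a])
        q_b[a][b] += alpha * (r_b - q_b[a][b])
    return q_a, q_b


q_a, q_b = train()
print("A's greedy action:", greedy(q_a))
print("B's greedy replies to each observed action:", [greedy(row) for row in q_b])
```

Because B conditions on A's action, its learning target for each observed action does not depend on how often A plays that action, which is one intuition for why the asymmetric setting can stabilize where two independent learners may not. The full Markov-game setting and the predictive-observation variant from the abstract are not covered by this sketch.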

