LG GT MA TH SY · Oct 21, 2020

On Information Asymmetry in Competitive Multi-Agent Reinforcement Learning: Convergence and Optimality

Ezra Tampubolon, Haris Ceribasic, Holger Boche

arXiv:2010.10901v22.36 citations

Originality: Incremental advance

AI Analysis

This paper addresses convergence and optimality in competitive multi-agent systems and offers a theoretical foundation for asymmetric-information scenarios, though it is incremental in that it extends existing Q-learning frameworks.

The paper tackles information asymmetry in competitive multi-agent reinforcement learning. It shows that when one agent observes the other's actions, learning can reach a stable outcome whose policies are near-optimal and form a Nash equilibrium, which generally does not happen among independent learners.

In this work, we study a system of two interacting non-cooperative Q-learning agents, where one agent has the privilege of observing the other's actions. We show that this information asymmetry can lead to a stable outcome of population learning, which generally does not occur in an environment of independent learners. The resulting post-learning policies are almost optimal in the underlying game sense, i.e., they form a Nash equilibrium. Furthermore, we propose a Q-learning algorithm that requires predictive observation of two subsequent actions of the opponent and yields an optimal strategy provided that the opponent applies a stationary strategy, and we discuss the existence of the Nash equilibrium in the underlying information-asymmetric game.
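To make the asymmetry concrete, here is a minimal sketch of two Q-learners in which only one sees the other's action. It is not the authors' algorithm: it assumes a stateless, repeated two-player matrix game with no discounting, and the function name `train_asymmetric_q`, the payoff matrices, and all hyperparameters are hypothetical choices for illustration. The uninformed agent keeps one value per own action, while the informed agent keeps a table indexed by the observed opponent action.

```python
import numpy as np

def train_asymmetric_q(payoff_a, payoff_b, episodes=5000, alpha=0.1,
                       epsilon=0.2, seed=0):
    """Stateless Q-learning for two agents; agent B observes agent A's action.

    payoff_a[a, b] and payoff_b[a, b] are the rewards of A and B when A plays a
    and B plays b. This is a simplified illustration, not the paper's method.
    """
    rng = np.random.default_rng(seed)
    n_a, n_b = payoff_a.shape
    q_a = np.zeros(n_a)          # uninformed agent: values over own actions
    q_b = np.zeros((n_a, n_b))   # informed agent: values per observed action

    for _ in range(episodes):
        # Agent A acts epsilon-greedily without any information about B.
        if rng.random() < epsilon:
            a = int(rng.integers(n_a))
        else:
            a = int(np.argmax(q_a))

        # Agent B observes a and acts epsilon-greedily on the matching row.
        if rng.random() < epsilon:
            b = int(rng.integers(n_b))
        else:
            b = int(np.argmax(q_b[a]))

        # Each agent moves its estimate toward its own realized reward.
        q_a[a] += alpha * (payoff_a[a, b] - q_a[a])
        q_b[a, b] += alpha * (payoff_b[a, b] - q_b[a, b])

    return q_a, q_b

if __name__ == "__main__":
    # Hypothetical general-sum payoffs, chosen only for the demo.
    pa = np.array([[1.0, 0.0], [3.0, 2.0]])
    pb = np.array([[1.0, 2.0], [4.0, 0.0]])
    q_a, q_b = train_asymmetric_q(pa, pb)
    print("A's greedy action:", int(np.argmax(q_a)))
    print("B's greedy reply to each observed action:", np.argmax(q_b, axis=1))
```

Because the informed agent conditions on the observed action, its learning problem for each row is a fixed bandit, so its greedy reply does not chase a moving target. This is one intuition for why asymmetry might stabilize learning in this toy setting; the paper's actual setting and guarantees are more general.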

