AI LG MA | Jan 26, 2019

Modelling Bounded Rationality in Multi-Agent Interactions by Generalized Recursive Reasoning

Ying Wen, Yaodong Yang, Rui Luo, Jun Wang

arXiv:1901.09216v222.261 citations | Has Code

Originality: Highly original

AI Analysis

The paper addresses the limitation of assuming perfect rationality in multi-agent reinforcement learning. Its contribution is incremental in that it extends recursive reasoning with hierarchical levels of rationality for more realistic agent modeling.

The paper tackles the problem of modeling bounded rationality in multi-agent interactions by introducing a generalized recursive reasoning (GR2) framework that allows agents to have hierarchical levels of rationality. It reports significant improvements over state-of-the-art opponent modeling baselines on normal-form games and the cooperative navigation benchmark.

Although rationality is limited in real-world decision making, most multi-agent reinforcement learning (MARL) models assume perfectly rational agents -- a property rarely met because of individuals' cognitive limitations and/or the intractability of the decision problem. In this paper, we introduce generalized recursive reasoning (GR2) as a novel framework to model agents with different hierarchical levels of rationality; our framework enables agents to exhibit varying levels of "thinking" ability, thereby allowing higher-level agents to best respond to various less sophisticated learners. We contribute both theoretically and empirically. On the theory side, we devise the hierarchical framework of GR2 through probabilistic graphical models and prove the existence of a perfect Bayesian equilibrium. Within GR2, we propose a practical actor-critic solver and demonstrate its convergence to a stationary point in two-player games through Lyapunov analysis. On the empirical side, we validate our findings on a variety of MARL benchmarks. Specifically, we first illustrate the hierarchical thinking process on the Keynes Beauty Contest, and then demonstrate significant improvements over state-of-the-art opponent modeling baselines on normal-form games and the cooperative navigation benchmark.
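The hierarchical "thinking" levels mentioned in the abstract can be illustrated on a p-beauty contest, where each player picks a number and the winner is the one closest to p times the average. The sketch below is not the paper's actor-critic solver; it is a minimal closed-form illustration under stated assumptions: a level-0 player guesses a fixed anchor (50 here), a level-k player best responds to opponents it believes are one level below, and the player's own effect on the average is ignored. The function names, the anchor value, and the mixture weights are all hypothetical.

```python
def level_k_guess(k, p=2/3, level0_guess=50.0):
    """Guess of a level-k player who assumes all opponents reason at level k-1.

    Assumption: level 0 plays a fixed anchor guess, and each higher level
    best responds by multiplying the lower level's guess by p.
    """
    guess = level0_guess
    for _ in range(k):
        guess = p * guess
    return guess


def mixture_guesses(max_level, weights, p=2/3, level0_guess=50.0):
    """Guesses for levels 0..max_level when a level-k player believes its
    opponents are a weighted mixture of all lower levels 0..k-1.

    `weights[j]` is the (hypothetical, unnormalized) belief weight on level j;
    it must have at least `max_level` entries, and the weights of the lower
    levels considered at each step must sum to a positive number.
    """
    guesses = [level0_guess]
    for k in range(1, max_level + 1):
        lower_weights = weights[:k]
        total = sum(lower_weights)
        # Expected opponent guess under the believed mixture of lower levels.
        expected = sum(w * g for w, g in zip(lower_weights, guesses)) / total
        # Best response is p times the expected opponent guess.
        guesses.append(p * expected)
    return guesses
```

Calling `level_k_guess(k)` for increasing `k` shows the guesses shrinking toward 0, which is the game's equilibrium when p < 1. `mixture_guesses` shrinks more slowly, because each level still places some belief on less sophisticated opponents.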
