AI
Sep 16, 2020

Theory of Mind with Guilt Aversion Facilitates Cooperative Reinforcement Learning

arXiv:2009.07445v1
12 citations
Originality: Incremental advance
AI Analysis

This paper addresses cooperative behavior in multi-agent reinforcement learning for social dilemmas. It is an incremental improvement that integrates psychological concepts (guilt aversion and Theory of Mind) into existing reinforcement learning frameworks.

The paper tackles the problem of suboptimal policies in social dilemmas like Stag Hunt by introducing Theory of Mind Agents with Guilt Aversion (ToMAGA), which use belief-based guilt aversion as a reward shaping mechanism. It shows that these agents efficiently learn cooperative behaviors.

Guilt aversion induces the experience of a utility loss in people if they believe they have disappointed others, and this promotes cooperative behaviour in humans. In psychological game theory, guilt aversion necessitates modelling agents that have a theory about what other agents think, also known as Theory of Mind (ToM). We aim to build a new kind of affective reinforcement learning agent, called Theory of Mind Agents with Guilt Aversion (ToMAGA), which is equipped with an ability to think about the wellbeing of others instead of just self-interest. To validate the agent design, we use a general-sum game known as Stag Hunt as a test bed. As standard reinforcement learning agents could learn suboptimal policies in social dilemmas like Stag Hunt, we propose to use belief-based guilt aversion as a reward shaping mechanism. We show that our belief-based guilt-averse agents can efficiently learn cooperative behaviours in Stag Hunt games.
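
To make the idea of belief-based guilt aversion as reward shaping concrete, here is a minimal sketch. It is not the paper's implementation. It assumes the simple guilt form common in psychological game theory, where an agent's material payoff is reduced in proportion to how far the other agent's actual payoff falls short of what the agent believes the other expected. The payoff values, the function names, and the parameters `second_order_belief` and `guilt_sensitivity` are all hypothetical and chosen only for illustration.

```python
# Illustrative one-shot Stag Hunt; payoff numbers are assumptions, not from the paper.
STAG, HARE = 0, 1

# PAYOFF[(my_action, other_action)] = (my_payoff, other_payoff)
PAYOFF = {
    (STAG, STAG): (4.0, 4.0),  # mutual cooperation pays best
    (STAG, HARE): (0.0, 3.0),  # lone stag hunter gets nothing
    (HARE, STAG): (3.0, 0.0),
    (HARE, HARE): (2.0, 2.0),  # safe but suboptimal equilibrium
}


def expected_payoff_of_other(second_order_belief, other_action):
    """Payoff I believe the other agent expected to receive.

    second_order_belief: my belief about the probability the other
    agent assigns to me playing STAG (hypothetical parameter).
    """
    p = second_order_belief
    return (p * PAYOFF[(STAG, other_action)][1]
            + (1.0 - p) * PAYOFF[(HARE, other_action)][1])


def guilt_shaped_reward(my_action, other_action,
                        second_order_belief, guilt_sensitivity):
    """Material payoff minus a guilt penalty for disappointing the other agent."""
    my_payoff, other_payoff = PAYOFF[(my_action, other_action)]
    expected = expected_payoff_of_other(second_order_belief, other_action)
    # Guilt arises only when the other agent does worse than expected.
    disappointment = max(0.0, expected - other_payoff)
    return my_payoff - guilt_sensitivity * disappointment


if __name__ == "__main__":
    # The other agent hunts stag and fully expects me to do the same.
    print(guilt_shaped_reward(STAG, STAG, 1.0, 2.0))
    print(guilt_shaped_reward(HARE, STAG, 1.0, 2.0))
```

Under these assumed numbers, defecting against a partner who expected cooperation becomes strictly worse than cooperating once the guilt term is applied, which is the intuition behind using guilt as a shaping signal. A learning agent would substitute this shaped reward for the raw payoff in its update rule.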
