Stubborn: An Environment for Evaluating Stubbornness between Agents with Aligned Incentives
This paper addresses the lack of research into social dynamics in fully-cooperative multi-agent reinforcement learning, though the contribution appears incremental because it focuses on a single behavioral aspect.
The paper tackles the problem of social dilemmas in fully-cooperative multi-agent settings, where aligned incentives do not guarantee cooperation, by introducing a measure of 'stubbornness' and an environment called Stubborn to evaluate it. In preliminary results, agents learn to use their partner's stubbornness as a signal to improve their choices.
Recent research in multi-agent reinforcement learning (MARL) has shown success in learning social behavior and cooperation. Social dilemmas between agents in mixed-sum settings have been studied extensively, but there is little research into social dilemmas in fully-cooperative settings, where agents have no prospect of gaining reward at another agent's expense. While fully-aligned interests are conducive to cooperation between agents, they do not guarantee it. We propose a measure of "stubbornness" between agents that aims to capture the human social behavior from which it takes its name: a disagreement that is gradually escalating and potentially disastrous. We would like to promote research into the tendency of agents to be stubborn, the reactions of counterpart agents, and the resulting social dynamics. In this paper, we present Stubborn, an environment for evaluating stubbornness between agents with fully-aligned incentives. In our preliminary results, the agents learn to use their partner's stubbornness as a signal for improving the choices that they make in the environment.