Learning Roles with Emergent Social Value Orientations
This paper addresses cooperation challenges in multi-agent systems and offers a novel approach to role learning, though it appears incremental in that it builds on existing SVO concepts.
The paper tackles intertemporal social dilemmas in multi-agent reinforcement learning by introducing a framework that learns roles through emergent social value orientations, achieving stable division of labor and cooperation in tasks of varying complexity.
Social dilemmas are situations in which individual rationality leads to collective irrationality. The multi-agent reinforcement learning community has drawn on ideas from social science, such as social value orientations (SVOs), to solve social dilemmas in complex cooperative tasks. In this paper, by first introducing the typical "division of labor or roles" mechanism of human society, we provide a promising solution for intertemporal social dilemmas (ISDs) with SVOs. We propose a novel learning framework, Learning Roles with Emergent SVOs (RESVO), which recasts the learning of roles as the emergence of social value orientations; this is symmetrically solved by endowing agents with altruism to share rewards with other agents. An SVO-based role embedding space is then constructed by conditioning individual policies on roles, using a novel rank regularizer and a mutual information maximizer. Experiments show that RESVO achieves a stable division of labor and cooperation in ISDs of varying complexity.
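To make the reward-sharing idea in the abstract concrete, here is a minimal sketch. It is an illustration only and not the paper's implementation: it assumes each agent's altruism can be written as a row of a hypothetical `share_matrix`, where entry `[i][j]` is the fraction of agent `i`'s environmental reward handed to agent `j`. The function name `redistribute_rewards` and the fixed matrix are assumptions; in RESVO the sharing behavior is learned rather than hand-set.

```python
def redistribute_rewards(rewards, share_matrix):
    """Return each agent's reward after sharing.

    rewards[i]         : environmental reward received by agent i
    share_matrix[i][j] : fraction of agent i's reward given to agent j
                         (hypothetical representation; each row sums to 1)
    """
    n = len(rewards)
    if len(share_matrix) != n:
        raise ValueError("share_matrix needs one row per agent")
    for row in share_matrix:
        if len(row) != n or min(row) < 0 or abs(sum(row) - 1.0) > 1e-9:
            raise ValueError("each row must be a distribution over agents")
    # Agent j collects the portions that every agent i assigns to it.
    return [
        sum(share_matrix[i][j] * rewards[i] for i in range(n))
        for j in range(n)
    ]


# Two agents: agent 0 earns all the reward and splits it evenly,
# agent 1 keeps whatever it earns (nothing in this step).
rewards = [4.0, 0.0]
share_matrix = [
    [0.5, 0.5],  # agent 0 is altruistic
    [0.0, 1.0],  # agent 1 is selfish
]
shaped = redistribute_rewards(rewards, share_matrix)
```

Because every row sums to 1, sharing only moves reward between agents and leaves the team total unchanged. This is one simple way to see how an altruistic agent can make a non-harvesting role worthwhile for its partner.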