GT · Jun 3

Learning to cooperate with emergent reputation via multi-agent reinforcement learning

Xinwei Song, Yizhe Huang, Dengji Zhao, Xue Feng

arXiv:2606.04359 · 52.4

Predicted impact: top 12% in GT · last 90 days
Originality: Incremental advance

AI Analysis

This work addresses the challenge of designing adaptive reputation systems for cooperation in multi-agent systems, offering a fully learned approach that overcomes limitations of predefined rules or intrinsic reward models.

The paper proposes COOPER, a multi-agent reinforcement learning method that jointly learns reputation assessment rules and reputation-based policies from environment rewards, enabling adaptation to various reputation systems and co-players. Experiments on donation and coin games show effective cooperation and emergence of reputation norms across diverse social network topologies.

Reputation, the aggregation of peer assessments diffused through social networks, is a pivotal mechanism for promoting cooperation in social dilemmas, which are ubiquitous in distributed multi-agent systems comprising agents with limited perception and cognitive capabilities. Exploring efficient reputation systems, comprising reputation assessment rules and reputation-based policies, is a long-standing challenge. Previous work assumes predefined reputation assessment rules or models reputation as an intrinsic reward to learn policies, compromising the methods' ability to generalize and adapt. To address this, we propose a distributed multi-agent reinforcement learning method $\textbf{COOPER}$ ($\textbf{COOP}$eration with $\textbf{E}$mergent $\textbf{R}$eputation), which jointly learns reputation assessment rules and reputation-based policies entirely from environment rewards. Notably, leveraging the underlying mechanisms of reputation, we deliberately design the constituent modules of $\textbf{COOPER}$ and the data flows among them, overcoming the latency and noise in the feedback signal caused by the deep entanglement between reputation and policy. Experiments on the donation game and the coin game in grid world environments demonstrate that $\textbf{COOPER}$ effectively adapts to various existing reputation systems and co-players. Furthermore, we observe the co-emergence of reputation norms and cooperation in self-play settings. These results hold robustly across diverse social network topologies, underscoring the generalizability and efficacy of our approach.
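To make the two components of a reputation system concrete, here is a minimal sketch of a donation game with a hand-coded assessment rule and a hand-coded reputation-based policy. This is not COOPER: it illustrates the kind of predefined reputation system that the abstract says earlier work assumes, whereas COOPER learns both parts from environment rewards. The payoff values, the "stern judging"-style rule, the discriminator policy, and all function names are assumptions chosen for illustration and do not come from the paper.

```python
import random

# Hypothetical payoffs: a cooperating donor pays COST, the recipient gains BENEFIT.
BENEFIT, COST = 3.0, 1.0


def assess(donor_action, recipient_rep):
    """Predefined assessment rule (stern-judging style, an assumption here).

    The donor earns a good reputation (1) for cooperating with a good
    recipient or defecting against a bad one, and a bad reputation (0) otherwise.
    """
    return 1 if (donor_action == "C") == (recipient_rep == 1) else 0


def policy(recipient_rep):
    """Reputation-based policy: cooperate only with recipients in good standing."""
    return "C" if recipient_rep == 1 else "D"


def play_round(reps, payoffs, donor, recipient):
    """Play one donor-recipient interaction and update payoffs and reputation."""
    action = policy(reps[recipient])
    if action == "C":
        payoffs[donor] -= COST
        payoffs[recipient] += BENEFIT
    # Only the donor acted, so only the donor is reassessed.
    reps[donor] = assess(action, reps[recipient])
    return action


def simulate(n_agents=4, n_rounds=100, seed=0):
    """Run random pairwise interactions, starting with everyone in good standing."""
    rng = random.Random(seed)
    reps = [1] * n_agents
    payoffs = [0.0] * n_agents
    for _ in range(n_rounds):
        donor, recipient = rng.sample(range(n_agents), 2)
        play_round(reps, payoffs, donor, recipient)
    return reps, payoffs
```

Under these assumptions, a population that starts in good standing keeps cooperating, and each round adds `BENEFIT - COST` to the total payoff. In a learned approach such as the one the abstract describes, both `assess` and `policy` would be replaced by trained components rather than fixed rules.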
