LGGTMA Oct 23, 2022

A Cooperative Reinforcement Learning Environment for Detecting and Penalizing Betrayal

arXiv:2210.12841v1 · 1 citation · h-index: 11
Originality: Incremental advance
AI Analysis

This work addresses social interaction challenges in multi-agent systems, though it appears incremental with preliminary results.

The paper tackles the problem of detecting and penalizing betrayal in cooperative reinforcement learning environments, showing that deceptive agents outperform honest baselines and that their betrayal detection method surpasses probabilistic baselines.

In this paper we present a Reinforcement Learning environment that leverages agent cooperation and communication, aimed at detecting, learning and ultimately penalizing betrayal patterns that emerge in the behavior of self-interested agents. We provide a description of the game rules, along with interesting cases of betrayal and the trade-offs that arise. Preliminary experimental investigations illustrate a) betrayal emergence, b) deceptive agents outperforming honest baselines and c) betrayal detection based on classification of behavioral features, which surpasses probabilistic detection baselines. Finally, we propose approaches for penalizing betrayal, list directions for future work and suggest interesting extensions of the environment towards capturing and exploring increasingly complex patterns of social interactions.

