MA | AI | CL | Oct 13, 2023

Welfare Diplomacy: Benchmarking Language Model Cooperation

arXiv:2310.08901v1 | 44 citations | h-index: 13 | Has Code
Originality: Incremental advance
AI Analysis

This work addresses the need for robust benchmarks to measure cooperation in multi-agent AI systems, which is crucial for societal safety, though it is incremental as it adapts an existing game.

The paper tackles the problem of benchmarking cooperative capabilities in AI systems by introducing Welfare Diplomacy, a general-sum variant of the board game Diplomacy, and finds that baseline agents using state-of-the-art language models achieve high social welfare but are exploitable.

The growing capabilities and increasingly widespread deployment of AI systems necessitate robust benchmarks for measuring their cooperative capabilities. Unfortunately, most multi-agent benchmarks are either zero-sum or purely cooperative, providing limited opportunities for such measurements. We introduce a general-sum variant of the zero-sum board game Diplomacy -- called Welfare Diplomacy -- in which players must balance investing in military conquest and domestic welfare. We argue that Welfare Diplomacy facilitates both a clearer assessment of and stronger training incentives for cooperative capabilities. Our contributions are: (1) proposing the Welfare Diplomacy rules and implementing them via an open-source Diplomacy engine; (2) constructing baseline agents using zero-shot prompted language models; and (3) conducting experiments where we find that baselines using state-of-the-art models attain high social welfare but are exploitable. Our work aims to promote societal safety by aiding researchers in developing and assessing multi-agent AI systems. Code to evaluate Welfare Diplomacy and reproduce our experiments is available at https://github.com/mukobi/welfare-diplomacy.

Code Implementations: 1 repo
