GT · AI · May 8

Mechanism Design Is Not Enough: Prosocial Agents for Cooperative AI

arXiv:2605.08426
AI Analysis

For AI safety researchers, the paper demonstrates that cooperative AI requires not just good rules but also intrinsically prosocial agents.

This paper proves that mechanism design alone cannot maximize social welfare in LLM agent interactions due to incomplete contracts, and shows that prosocial agents can close this welfare gap, achieving superior outcomes in multi-agent resource allocation and social dilemmas.

Ensuring that AI agents behave safely and beneficially when interacting with other parties has emerged as one of the central challenges of modern AI safety. Mechanism design, the theory of designing rules that align individual and collective objectives, can incentivize cooperative behavior, but whether it alone suffices to maximize social welfare among LLM agents remains an open question. This work proves that the answer is negative: drawing from incomplete contract theory, we formally show that when contracts cannot distinguish all relevant future contingencies, there is a strictly positive welfare loss that no realistic mechanism can eliminate. We show that prosocial agents, who weigh others' welfare alongside their own, can close this gap and achieve outcomes that are socially superior and individually beneficial. Experimentally, we show that prosociality is beneficial in multi-agent resource-allocation environments and canonical social dilemmas where the agents are powered by large language models. The implication for AI safety is clear: to enable cooperative interactions at scale, designing adequate mechanisms is not sufficient; agents must be built to be intrinsically prosocial.
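To make the idea of agents "who weigh others' welfare alongside their own" concrete, here is a minimal sketch in a one-shot Prisoner's Dilemma. It is not the paper's model: the payoff numbers, the linear weighting `(1 - alpha) * own + alpha * other`, and the names `PAYOFF`, `prosocial_utility`, `best_response`, `nash_equilibria`, and `welfare` are all assumptions chosen for illustration.

```python
# Illustrative Prisoner's Dilemma payoffs (assumed values, not taken from the paper).
# PAYOFF[(a, b)] = (payoff to the player choosing a, payoff to the player choosing b).
PAYOFF = {
    ("C", "C"): (3, 3),
    ("C", "D"): (0, 5),
    ("D", "C"): (5, 0),
    ("D", "D"): (1, 1),
}
ACTIONS = ("C", "D")  # C = cooperate, D = defect


def prosocial_utility(own, other, alpha):
    """Assumed linear prosocial utility: alpha = 0 is purely selfish,
    alpha = 0.5 weighs the other player's payoff equally with one's own."""
    return (1 - alpha) * own + alpha * other


def best_response(opponent_action, alpha):
    """Action maximizing prosocial utility against a fixed opponent action.
    The game is symmetric, so one function serves both players."""
    def utility(action):
        own, other = PAYOFF[(action, opponent_action)]
        return prosocial_utility(own, other, alpha)
    return max(ACTIONS, key=utility)


def nash_equilibria(alpha):
    """Pure-strategy profiles where each action is a best response to the other."""
    return [
        (a, b)
        for a in ACTIONS
        for b in ACTIONS
        if best_response(b, alpha) == a and best_response(a, alpha) == b
    ]


def welfare(profile):
    """Social welfare as the sum of the material payoffs (not the prosocial utilities)."""
    return sum(PAYOFF[profile])


if __name__ == "__main__":
    for alpha in (0.0, 0.5):
        for profile in nash_equilibria(alpha):
            print(f"alpha={alpha}: equilibrium {profile}, welfare {welfare(profile)}")
```

With these assumed payoffs, selfish agents (`alpha = 0.0`) end up at mutual defection with total welfare 2, while agents with `alpha = 0.5` end up at mutual cooperation with total welfare 6, and each agent's own material payoff rises from 1 to 3. This toy case only illustrates how prosocial preferences can change equilibrium behavior without changing the rules of the game; it does not reproduce the paper's incomplete-contract result or its LLM experiments.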

