Sustaining Cooperation in Populations Guided by AI: A Folk Theorem for LLMs

Jonathan Shaki, Eden Hartman, Sarit Kraus, Yonatan Aumann

arXiv:2605.06525

AI Analysis

For researchers and practitioners deploying LLMs as advisors in multi-agent systems, this work reveals that shared LLM guidance can fundamentally alter strategic outcomes, enabling cooperation where it would not otherwise exist.

This paper studies how large language models (LLMs) that provide instructions to interacting agents can sustain cooperation even when the agents have misaligned incentives. The authors prove a folk theorem showing that all feasible and individually rational outcomes can be sustained as ε-equilibria in the repeated setting, even though observation is indirect and clients cannot identify which LLM advised their opponents.
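
For readers unfamiliar with the terminology, the textbook forms of the notions named above can be written as follows. This is a reference sketch only: the symbols $u_i$, $\sigma$, $A$, and $\underline{v}_i$ are assumed notation, and the paper's own definitions, which are stated at the level of LLMs and their clients, may differ in detail.

```latex
% Textbook definitions (assumed notation, not taken from the paper).
% Players i = 1, ..., n; pure action profiles a \in A; payoffs u_i.

% Feasible payoffs: convex combinations of pure-action payoff vectors.
F = \operatorname{conv}\{\, (u_1(a), \dots, u_n(a)) : a \in A \,\}

% Minmax value of player i: the lowest payoff the others can hold i to.
\underline{v}_i = \min_{\sigma_{-i}} \max_{\sigma_i} u_i(\sigma_i, \sigma_{-i})

% Individually rational payoff vectors: v_i \ge \underline{v}_i for every i.

% \varepsilon-equilibrium: no player gains more than \varepsilon by deviating.
u_i(\sigma_i', \sigma_{-i}) \le u_i(\sigma_i, \sigma_{-i}) + \varepsilon
\quad \text{for all } i \text{ and all } \sigma_i'
```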

Large language models (LLMs) are increasingly used to provide instructions to many agents who interact with one another. Such shared reliance couples agents who appear to act independently: they may in fact be guided by a common model. This coupling can change the prospects for cooperation among agents with misaligned incentives. We study settings in which multiple LLMs each advise a population of clients who participate in instances of an underlying game, creating strategic interaction at the level of the LLMs themselves. This induces a meta-game among the LLMs, mediated through clients. We first analyze the one-shot setting, where shared instructions can change equilibrium behavior only when an LLM may influence more than one role in the same interaction; in such cases, cooperation may emerge, and the effect of client share can be beneficial, harmful, or non-monotone, depending on the base game. Our main result concerns the repeated setting. We prove a folk theorem for LLMs: despite indirect observation and the clients' inability to identify which LLM advised their opponents, all feasible and individually rational outcomes can be sustained as $\varepsilon$-equilibria. The result does not follow from the standard folk theorem and requires new proof techniques. Together, these results show that shared LLM guidance can sustain cooperation among populations of agents even when the underlying incentives are misaligned.
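
The one-shot intuition can be illustrated with a toy calculation. The sketch below is not the paper's model: it assumes a Prisoner's Dilemma with hypothetical payoffs, a single LLM whose clients make up a fraction `share` of the population and all follow its advice, uniformly random matching, and outsiders who always defect. The function names and the assumption that the LLM maximizes its clients' average payoff are also illustrative. Under these assumptions, advising cooperation pays off only once the LLM is likely enough to be advising both roles in the same interaction.

```python
from fractions import Fraction

# Hypothetical Prisoner's Dilemma payoffs, chosen for illustration only:
# R = mutual cooperation, P = mutual defection,
# S = cooperating against a defector.
R, P, S = Fraction(3), Fraction(1), Fraction(0)


def client_payoff(advice, share):
    """Expected payoff of one client when the LLM gives `advice` to all its clients.

    Assumptions (not from the paper): opponents are drawn uniformly at
    random, a fraction `share` of them are fellow clients who follow the
    same advice, and all other agents defect.
    """
    if advice == "C":
        # Fellow client: mutual cooperation. Outsider: cooperator is exploited.
        return share * R + (1 - share) * S
    # Advice "D": every opponent defects too, so the payoff is always P.
    return P


def best_advice(share):
    """Advice that maximizes the clients' expected payoff (ties go to "C")."""
    return "C" if client_payoff("C", share) >= client_payoff("D", share) else "D"


def cooperation_threshold():
    """Smallest share at which advising cooperation is at least as good as defection."""
    return (P - S) / (R - S)


if __name__ == "__main__":
    for share in (Fraction(1, 4), Fraction(1, 3), Fraction(1, 2)):
        print(share, best_advice(share))
```

With these particular payoffs a larger client share helps cooperation. The abstract notes that in other base games the effect of client share can be harmful or non-monotone, which this toy example does not capture.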
