Discovering Differences in Strategic Behavior Between Humans and LLMs
This work addresses the need to assess LLM behavior in social and strategic applications. Its contribution is incremental, since it builds on existing behavioral game theory frameworks.
The study examines behavioral differences between humans and LLMs in strategic scenarios and finds that frontier LLMs can be capable of deeper strategic behavior than humans in iterated rock-paper-scissors.
As Large Language Models (LLMs) are increasingly deployed in social and strategic scenarios, it becomes critical to understand where and why their behavior diverges from that of humans. While behavioral game theory (BGT) provides a framework for analyzing behavior, existing models do not fully capture the idiosyncratic behavior of humans or of black-box, non-human agents like LLMs. We employ AlphaEvolve, a cutting-edge program discovery tool, to discover interpretable models of human and LLM behavior directly from data, thereby enabling open-ended discovery of the structural factors driving that behavior. Our analysis of iterated rock-paper-scissors reveals that frontier LLMs can be capable of deeper strategic behavior than humans. These results provide a foundation for understanding the structural factors that drive differences between human and LLM behavior in strategic interactions.
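To make the notion of "depth" of strategic behavior concrete, the sketch below shows a level-k style recursion of the kind used in behavioral game theory, applied to iterated rock-paper-scissors. It is an illustrative assumption only: the abstract does not describe the models discovered by AlphaEvolve, and the names `BEATS` and `level_k_move` are hypothetical. A level-0 player is assumed to repeat its own last move, and a level-k player best-responds to an opponent it models as level k-1.

```python
# Illustrative level-k sketch (an assumption, not the study's discovered model).
# BEATS[m] is the move that defeats m.
BEATS = {"R": "P", "P": "S", "S": "R"}


def level_k_move(my_last: str, opp_last: str, k: int) -> str:
    """Return the next move of a player reasoning at depth k.

    Level 0 (assumed baseline): repeat one's own previous move.
    Level k > 0: predict the opponent's move as a level-(k-1) player
    seen from the opponent's side, then play the move that beats it.
    """
    if k < 0:
        raise ValueError("k must be non-negative")
    if k == 0:
        return my_last
    # Swap roles: from the opponent's viewpoint, their last move is "my_last".
    predicted_opponent = level_k_move(opp_last, my_last, k - 1)
    return BEATS[predicted_opponent]


if __name__ == "__main__":
    # Both players played rock last round; compare reasoning depths 0 to 3.
    for depth in range(4):
        print(depth, level_k_move("R", "R", depth))
```

Under these assumptions, a "deeper" agent corresponds to a larger `k`. Fitting `k` to observed play would be one simple way to compare two populations of players, though the study's data-driven program discovery is not limited to this model family.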