Not Yet: Humans Outperform LLMs in a Colonel Blotto Tournament

arXiv:2605.22095
AI Analysis

This work provides empirical evidence that current LLMs lack the strategic sophistication of humans in complex, high-dimensional games, highlighting a limitation for economists and AI researchers studying strategic behavior.

In Colonel Blotto tournaments, humans outperformed LLMs by employing better-calibrated intermediate-level allocation heuristics, while LLMs used simpler, more stereotyped strategies. Humans also showed minimal strategy adjustment across different opponent sets, treating LLMs similarly to human competitors.

The emergence of large language models (LLMs) has spurred economists to study how humans and LLMs behave in strategic settings. We organized a series of round-robin tournaments in the Colonel Blotto game, which attracts game theorists' attention because of its high-dimensional action space and the absence of pure-strategy Nash equilibria. In the first tournament, more than 200 human participants competed against one another. In the second tournament, several popular LLMs were invited to submit strategies. In the third tournament, we matched the number of LLM strategies to the number submitted by humans. We find that humans more often employ better-calibrated intermediate-level allocation heuristics and outperform the simpler, more stereotyped strategies submitted by LLMs. Strategic sophistication is key to success if and only if the necessary depth of reasoning is reached; lower and higher levels of reasoning offer no clear advantage over primitive strategies. Among humans, field of study weakly predicts success: participants with STEM backgrounds perform better in the first tournament. Surprisingly, humans barely adjust their strategies across tournaments with different sets of opponents. This result suggests that humans base their choices primarily on the game's rules rather than on the identity of their opponents, treating LLMs much like human competitors.
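To make the round-robin format concrete, here is a minimal Python sketch of how submitted allocations could be scored against one another. The abstract does not state the number of battlefields, the troop budget, or the scoring rule, so the five battlefields, the budget of 100, and the "one point per battlefield won, ties split" rule below are assumptions, and the three named entries are hypothetical toy strategies rather than submissions from the study.

```python
from itertools import combinations


def play(a, b):
    """Score one Blotto match between two allocations.

    Assumed rule (not from the paper): each battlefield is worth one point,
    the larger allocation wins it, and a tie splits the point.
    """
    if len(a) != len(b):
        raise ValueError("allocations must cover the same battlefields")
    score_a = score_b = 0.0
    for x, y in zip(a, b):
        if x > y:
            score_a += 1.0
        elif y > x:
            score_b += 1.0
        else:
            score_a += 0.5
            score_b += 0.5
    return score_a, score_b


def round_robin(strategies):
    """Play every pair of entries once and return each entry's total score."""
    totals = {name: 0.0 for name in strategies}
    for p, q in combinations(strategies, 2):
        score_p, score_q = play(strategies[p], strategies[q])
        totals[p] += score_p
        totals[q] += score_q
    return totals


# Hypothetical entries: 100 troops over 5 battlefields (both assumed values).
entries = {
    "uniform": (20, 20, 20, 20, 20),
    "concentrated": (50, 50, 0, 0, 0),
    "three_fields": (34, 33, 33, 0, 0),
}

totals = round_robin(entries)
for name, score in sorted(totals.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {score}")
```

Under these assumed rules the three toy entries beat one another in a cycle (uniform beats concentrated, concentrated beats three_fields, three_fields beats uniform), so they finish level. This is a small illustration of why the game has no pure-strategy Nash equilibrium: every fixed allocation can be beaten by some other allocation.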
