GN AI CY HC
Oct 25, 2024

Take Caution in Using LLMs as Human Surrogates: Scylla Ex Machina

arXiv:2410.19599v3
47 citations
h-index: 7
Originality: Incremental advance
AI Analysis

The paper raises a critical issue for social scientists and AI researchers: it cautions against using LLMs as human simulations because they fail in unpredictable ways. It is an incremental contribution that challenges a recent trend.

The paper examines whether large language models (LLMs) can serve as surrogates for humans in social science research. It assesses their reasoning depth with the 11-20 money request game and finds that nearly all advanced approaches fail to replicate human behavior distributions.

Recent studies suggest large language models (LLMs) can exhibit human-like reasoning, aligning with human behavior in economic experiments, surveys, and political discourse. This has led many to propose that LLMs can be used as surrogates or simulations for humans in social science research. However, LLMs differ fundamentally from humans, relying on probabilistic patterns, absent the embodied experiences or survival objectives that shape human cognition. We assess the reasoning depth of LLMs using the 11-20 money request game. Nearly all advanced approaches fail to replicate human behavior distributions across many models. Causes of failure are diverse and unpredictable, relating to input language, roles, and safeguarding. These results advise caution when using LLMs to study human behavior or as surrogates or simulations.
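The abstract does not spell out the rules of the 11-20 money request game, so the following is a minimal sketch under stated assumptions. It takes the rules as they are usually given for this game (Arad and Rubinstein's design): each of two players requests an integer amount from 11 to 20, receives that amount, and earns a bonus of 20 if the request is exactly one less than the other player's. The level-0 anchor of 20 and the function names `payoff`, `best_response`, `level_k_request`, and `implied_depth` are illustrative choices, not taken from the paper.

```python
# Sketch of the 11-20 money request game and a level-k reading of a request.
# Assumptions (not stated in the abstract): requests are integers in [11, 20],
# the bonus is 20 for undercutting the opponent by exactly 1, and a level-0
# player naively requests the maximum, 20.

LOW, HIGH, BONUS = 11, 20, 20


def payoff(own: int, other: int) -> int:
    """Payoff to the player requesting `own` when the opponent requests `other`."""
    if not (LOW <= own <= HIGH and LOW <= other <= HIGH):
        raise ValueError("requests must be integers from 11 to 20")
    return own + (BONUS if own == other - 1 else 0)


def best_response(other: int) -> int:
    """Request that maximizes payoff against a known opponent request."""
    return max(range(LOW, HIGH + 1), key=lambda own: payoff(own, other))


def level_k_request(k: int) -> int:
    """Request of a level-k player who best-responds to a level-(k-1) player."""
    request = HIGH  # assumed level-0 anchor
    for _ in range(k):
        request = best_response(request)
    return request


def implied_depth(request: int) -> int:
    """Reasoning depth implied by a request under the level-0 = 20 assumption."""
    if not LOW <= request <= HIGH:
        raise ValueError("requests must be integers from 11 to 20")
    return HIGH - request


if __name__ == "__main__":
    for k in range(4):
        print(k, level_k_request(k))
```

Under these assumptions each extra step of reasoning lowers the request by one, which is why the distribution of requests can be read as a distribution of reasoning depths. Note that the best response to 11 is 20, since no lower request is allowed, so the mapping cycles after nine steps.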

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
