CY AI · May 12

Stochastic Parrots or Singing in Harmony? Testing Five Leading LLMs for their Ability to Replicate a Human Survey with Synthetic Data

Jason Miklian, Kristian Hoelscher, John E. Katsos

arXiv:2603.00059 · 50.1 · h-index: 15

Predicted impact: top 40% in CY · last 90 days · Originality: Synthesis-oriented

AI Analysis

For organizational researchers considering LLM-generated synthetic survey data, this paper provides evidence that current models reproduce conventional wisdom rather than surface novel insights, underscoring the need for validation protocols.

The study compared human survey responses from 420 Silicon Valley coders and developers with synthetic data from five leading LLMs. The AI-generated responses were technically plausible and consistent with one another, but they failed to capture the survey's counterintuitive insights and deviated from the real human data as a group. The authors conclude that synthetic data cannot meaningfully replicate human social beliefs in organizational contexts.

How well can AI-derived synthetic research data replicate the responses of human participants? An emerging literature has begun to engage with this question, which carries deep implications for organizational research practice. This article presents a comparison between a human-respondent survey of 420 Silicon Valley coders and developers and synthetic survey data designed to simulate real survey takers generated by five leading Generative AI Large Language Models: ChatGPT Thinking 5 Pro, Claude Sonnet 4.5 Pro plus Claude CoWork 1.123, Gemini Advanced 2.5 Pro, Incredible 1.0, and DeepSeek 3.2. Our findings reveal that while AI agents produced technically plausible results that lean more towards replicability and harmonization than assumed, none were able to capture the counterintuitive insights that made the human survey valuable. Moreover, deviations grouped together for all models, leaving the real data as the outlier. Our key finding is that while leading LLMs are increasingly being used to scale, replicate and replace human survey responses in research, these advances only show an increased capacity to parrot conventional wisdom in harmony with each other rather than revealing novel findings. If synthetic respondents are used in future research, we need more replicable validation protocols and reporting standards for when and where synthetic survey data can be used responsibly, a gap that this paper fills. Our results suggest that synthetic survey responses cannot meaningfully model real human social beliefs within organizations, particularly in contexts lacking previously documented evidence. We conclude that synthetic survey-based research should be cast not as a substitute for rigorous survey methods, but as an increasingly reliable pre- or post-fieldwork instrument for identifying societal assumptions, conventional wisdoms, and other expectations about research populations.

