SI CL CYMay 13

When Do LLMs Generate Realistic Social Networks? A Multi-Dimensional Study of Culture, Language, Scale, and Method

Sai Hemanth Kilaru, Sriram Theerdh Manikyala, Raghav Upadhyay, Sri Sai Kumar Ramavath, Srivika Nunavathu, Dalal Alharthi

arXiv:2605.1289874.7

AI Analysis

For researchers using LLMs as social network simulators, this work shows that seemingly minor prompt choices encode substantive sociological assumptions, limiting validity.

LLMs can generate realistic social networks on clustering and modularity, but encode demographic biases above empirical levels; prompt language and architecture shift homophily patterns, with political affiliation dominating under most methods.

Large language models (LLMs) are increasingly used as substitutes for human subjects in behavioral simulations, including synthetic social network generation. Yet it remains unclear how their relational outputs depend on prompt design, cultural framing, prompt language, and model scale. Building on homophily theory and structural balance theory, we formalize four LLM-based tie-formation mechanisms: sequential, global, local, and iterative, and treat them as distinct conditional distributions over edge sets. Using a fixed roster of 50 demographically grounded personas, we generate 192 verified directed networks across four cultural contexts, four prompt languages, three GPT-4.1 variants, and four prompting architectures, with two seeds per condition. We find that cultural framing shifts inbreeding homophily and largest-component connectivity. Political affiliation dominates tie formation under three methods, while the global method substitutes age, showing that prompt architecture functions as a substantive sociological variable. Model scale produces a stable divergence ranking, with the smallest variant behaving qualitatively differently rather than merely noisily. Prompt language alone sharply shifts religion homophily, especially under Hindi prompting, while leaving political homophily nearly invariant. LLM-generated networks match real social graphs on clustering and modularity better than standard graph baselines, yet encode demographic biases above empirical levels. These results show that prompt choices often treated as implementation details encode substantive sociological assumptions.

View on arXiv PDF

Similar