47.5AIJun 2
Think-Before-Speak: From Internal Evaluation to Public Expression in Multi-Agent Social SimulationKaiqi Yang, Tai-Quan Peng, Sanguk Lee et al.
LLM-based multi-agent simulation offers a promising way to study social interaction, deliberation, and collective opinion dynamics. However, many existing dialogue simulation frameworks represent interaction mainly as observable turn exchange or aggregated outputs, leaving the internal evaluative processes behind silence, speaking intention, and public expression difficult to examine. We introduce TBS (Think-Before-Speak), an interval-based multi-agent simulation framework that separates agents' private reasoning from public utterance generation. At each interval, all agents update structured internal states based on the shared dialogue history and their own memory. These states include dissonance-related appraisal, perceived opinion climate, perceived isolation risk, response strategy, and willingness to speak. The orchestrator then resolves competing speaking intentions and commits one utterance to the public dialogue, allowing internal evaluation and public interaction to co-evolve over time. We evaluate TBS in simulated town hall discussions on a climate-related policy issue. Results show that TBS produces coherent internal-state traces and that these traces vary systematically across turn-allocation, silence, and memory conditions. Dissonance-related appraisal increases agents' willingness to speak, whereas silence-pressure appraisal decreases it. Once speaking intention is formed, public expression is shaped mainly by turn-allocation rules. These findings suggest that TBS supports mechanism-sensitive social simulation by making the pathway from internal evaluation to public expression observable and analyzable.
AIOct 20, 2024
Exploring Social Desirability Response Bias in Large Language Models: Evidence from GPT-4 SimulationsSanguk Lee, Kai-Qi Yang, Tai-Quan Peng et al.
Large language models (LLMs) are employed to simulate human-like responses in social surveys, yet it remains unclear if they develop biases like social desirability response (SDR) bias. To investigate this, GPT-4 was assigned personas from four societies, using data from the 2022 Gallup World Poll. These synthetic samples were then prompted with or without a commitment statement intended to induce SDR. The results were mixed. While the commitment statement increased SDR index scores, suggesting SDR bias, it reduced civic engagement scores, indicating an opposite trend. Additional findings revealed demographic associations with SDR scores and showed that the commitment statement had limited impact on GPT-4's predictive performance. The study underscores potential avenues for using LLMs to investigate biases in both humans and LLMs themselves.
CYDec 21, 2024
Beyond Partisan Leaning: A Comparative Analysis of Political Bias in Large Language ModelsTai-Quan Peng, Kaiqi Yang, Sanguk Lee et al.
As large language models (LLMs) become increasingly embedded in civic, educational, and political information environments, concerns about their potential political bias have grown. Prior research often evaluates such bias through simulated personas or predefined ideological typologies, which may introduce artificial framing effects or overlook how models behave in general use scenarios. This study adopts a persona-free, topic-specific approach to evaluate political behavior in LLMs, reflecting how users typically interact with these systems-without ideological role-play or conditioning. We introduce a two-dimensional framework: one axis captures partisan orientation on highly polarized topics (e.g., abortion, immigration), and the other assesses sociopolitical engagement on less polarized issues (e.g., climate change, foreign policy). Using survey-style prompts drawn from the ANES and Pew Research Center, we analyze responses from 43 LLMs developed in the U.S., Europe, China, and the Middle East. We propose an entropy-weighted bias score to quantify both the direction and consistency of partisan alignment, and identify four behavioral clusters through engagement profiles. Findings show most models lean center-left or left ideologically and vary in their nonpartisan engagement patterns. Model scale and openness are not strong predictors of behavior, suggesting that alignment strategy and institutional context play a more decisive role in shaping political expression.
CLFeb 2, 2025
Embracing Dialectic Intersubjectivity: Coordination of Different Perspectives in Content Analysis with LLM Persona SimulationTaewoo Kang, Kjerstin Thorson, Tai-Quan Peng et al.
This study attempts to advancing content analysis methodology from consensus-oriented to coordination-oriented practices, thereby embracing diverse coding outputs and exploring the dynamics among differential perspectives. As an exploratory investigation of this approach, we evaluate six GPT-4o configurations to analyze sentiment in Fox News and MSNBC transcripts on Biden and Trump during the 2020 U.S. presidential campaign, examining patterns across these models. By assessing each model's alignment with ideological perspectives, we explore how partisan selective processing could be identified in LLM-Assisted Content Analysis (LACA). Findings reveal that partisan persona LLMs exhibit stronger ideological biases when processing politically congruent content. Additionally, intercoder reliability is higher among same-partisan personas compared to cross-partisan pairs. This approach enhances the nuanced understanding of LLM outputs and advances the integrity of AI-driven social science research, enabling simulations of real-world implications.