Sander van der Linden

CY
h-index97
5papers
192citations
Novelty58%
AI Score48

5 Papers

CLOct 24, 2023
Generative Language Models Exhibit Social Identity Biases

Tiancheng Hu, Yara Kyrychenko, Steve Rathje et al.

The surge in popularity of large language models has given rise to concerns about biases that these models could learn from humans. We investigate whether ingroup solidarity and outgroup hostility, fundamental social identity biases known from social psychology, are present in 56 large language models. We find that almost all foundational language models and some instruction fine-tuned models exhibit clear ingroup-positive and outgroup-negative associations when prompted to complete sentences (e.g., "We are..."). Our findings suggest that modern language models exhibit fundamental social identity biases to a similar degree as humans, both in the lab and in real-world conversations with LLMs, and that curating training data and instruction fine-tuning can mitigate such biases. Our results have practical implications for creating less biased large-language models and further underscore the need for more research into user interactions with LLMs to prevent potential bias reinforcement in humans.

HCFeb 26
Addressing Climate Action Misperceptions with Generative AI

Miriam Remshard, Yara Kyrychenko, Sander van der Linden et al.

Mitigating climate change requires behaviour change. However, even climate-concerned individuals often hold misperceptions about which actions most reduce carbon emissions. We recruited 1201 climate-concerned individuals to examine whether discussing climate actions with a large language model (LLM) equipped with climate knowledge and prompted to provide personalised responses would foster more accurate perceptions of the impacts of climate actions and increase willingness to adopt feasible, high-impact behaviours. We compared this to having participants run a web search, have a conversation with an unspecialised LLM, and no intervention. The personalised climate LLM was the only condition that led to increased knowledge about the impacts of climate actions and greater intentions to adopt impactful behaviours. While the personalised climate LLM did not outperform a web search in improving understanding of climate action impacts, the ability of LLMs to deliver personalised, actionable guidance may make them more effective at motivating impactful pro-climate behaviour change.

CYFeb 13
How cyborg propaganda reshapes collective action

Jonas R. Kunst, Kinga Bierwiaczonek, Meeyoung Cha et al.

The distinction between genuine grassroots activism and automated influence operations is collapsing. While policy debates focus on bot farms, a distinct threat to democracy is emerging via partisan coordination apps and artificial intelligence-what we term 'cyborg propaganda.' This architecture combines large numbers of verified humans with adaptive algorithmic automation, enabling a closed-loop system. AI tools monitor online sentiment to optimize directives and generate personalized content for users to post online. Cyborg propaganda thereby exploits a critical legal shield: by relying on verified citizens to ratify and disseminate messages, these campaigns operate in a regulatory gray zone, evading liability frameworks designed for automated botnets. We explore the collective action paradox of this technology: does it democratize power by 'unionizing' influence (pooling the reach of dispersed citizens to overcome the algorithmic invisibility of isolated voices), or does it reduce citizens to 'cognitive proxies' of a central directive? We argue that cyborg propaganda fundamentally alters the digital public square, shifting political discourse from a democratic contest of individual ideas to a battle of algorithmic campaigns. We outline a research agenda to distinguish organic from coordinated information diffusion and propose governance frameworks to address the regulatory challenges of AI-assisted collective expression.

CYMay 18, 2025
How Malicious AI Swarms Can Threaten Democracy: The Fusion of Agentic AI and LLMs Marks a New Frontier in Information Warfare

Daniel Thilo Schroeder, Meeyoung Cha, Andrea Baronchelli et al.

Public opinion manipulation has entered a new phase, amplifying its roots in rhetoric and propaganda. Advances in large language models (LLMs) and autonomous agents now let influence campaigns reach unprecedented scale and precision. Researchers warn AI could foster mass manipulation. Generative tools can expand propaganda output without sacrificing credibility and inexpensively create election falsehoods that are rated as more human-like than those written by humans. Techniques meant to refine AI reasoning, such as chain-of-thought prompting, can just as effectively be used to generate more convincing falsehoods. Enabled by these capabilities, another disruptive threat is emerging: swarms of collaborative, malicious AI agents. Fusing LLM reasoning with multi-agent architectures, these systems are capable of coordinating autonomously, infiltrating communities, and fabricating consensus cheaply. By adaptively mimicking human social dynamics, they threaten democracy.

HCMar 5, 2025
Human Preferences for Constructive Interactions in Language Model Alignment

Yara Kyrychenko, Jon Roozenbeek, Brandon Davidson et al.

As large language models (LLMs) enter the mainstream, aligning them to foster constructive dialogue rather than exacerbate societal divisions is critical. Using an individualized and multicultural alignment dataset of over 7,500 conversations of individuals from 74 countries engaging with 21 LLMs, we examined how linguistic attributes linked to constructive interactions are reflected in human preference data used for training AI. We found that users consistently preferred well-reasoned and nuanced responses while rejecting those high in personal storytelling. However, users who believed that AI should reflect their values tended to place less preference on reasoning in LLM responses and more on curiosity. Encouragingly, we observed that users could set the tone for how constructive their conversation would be, as LLMs mirrored linguistic attributes, including toxicity, in user queries.