CLAILGOct 22, 2025

Are Large Language Models Sensitive to the Motives Behind Communication?

arXiv:2510.19687v16 citationsh-index: 16
Originality Incremental advance
AI Analysis

This addresses the problem of LLMs' trustworthiness in real-world communication for AI safety and deployment, but it is incremental as it builds on existing cognitive science models.

The study investigated whether large language models (LLMs) can critically evaluate content by considering the motivations of the source, finding that they successfully discount biased information in controlled experiments but struggle in naturalistic settings like online adverts, though a simple intervention improved performance.

Human communication is motivated: people speak, write, and create content with a particular communicative intent in mind. As a result, information that large language models (LLMs) and AI agents process is inherently framed by humans' intentions and incentives. People are adept at navigating such nuanced information: we routinely identify benevolent or self-serving motives in order to decide what statements to trust. For LLMs to be effective in the real world, they too must critically evaluate content by factoring in the motivations of the source -- for instance, weighing the credibility of claims made in a sales pitch. In this paper, we undertake a comprehensive study of whether LLMs have this capacity for motivational vigilance. We first employ controlled experiments from cognitive science to verify that LLMs' behavior is consistent with rational models of learning from motivated testimony, and find they successfully discount information from biased sources in a human-like manner. We then extend our evaluation to sponsored online adverts, a more naturalistic reflection of LLM agents' information ecosystems. In these settings, we find that LLMs' inferences do not track the rational models' predictions nearly as closely -- partly due to additional information that distracts them from vigilance-relevant considerations. However, a simple steering intervention that boosts the salience of intentions and incentives substantially increases the correspondence between LLMs and the rational model. These results suggest that LLMs possess a basic sensitivity to the motivations of others, but generalizing to novel real-world settings will require further improvements to these models.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes