CL · Sep 26, 2025

The Bias is in the Details: An Assessment of Cognitive Bias in LLMs

R. Alexander Knipper, Charles S. Knipper, Kaiqi Zhang, Valerie Sims, Clint Bowers, Santu Karmaker


arXiv:2509.22856v1 · 12.06 citations · h-index: 102

Originality: Synthesis-oriented

AI Analysis

The paper addresses cognitive bias in LLMs used for real-world decision-making, but it is incremental: it evaluates existing biases without proposing new mitigation methods.

This paper assesses cognitive bias in 45 LLMs across eight biases, finding bias-consistent behavior in 17.8-57.3% of instances. Larger models (>32B parameters) reduce bias in 39.5% of cases, and higher prompt detail shifts bias by up to 14.9%.

As Large Language Models (LLMs) are increasingly embedded in real-world decision-making processes, it becomes crucial to examine the extent to which they exhibit cognitive biases. Extensively studied in the field of psychology, cognitive biases appear as systematic distortions commonly observed in human judgments. This paper presents a large-scale evaluation of eight well-established cognitive biases across 45 LLMs, analyzing over 2.8 million LLM responses generated through controlled prompt variations. To achieve this, we introduce a novel evaluation framework based on multiple-choice tasks, hand-curate a dataset of 220 decision scenarios targeting fundamental cognitive biases in collaboration with psychologists, and propose a scalable approach for generating diverse prompts from human-authored scenario templates. Our analysis shows that LLMs exhibit bias-consistent behavior in 17.8-57.3% of instances across a range of judgment and decision-making contexts targeting anchoring, availability, confirmation, framing, interpretation, overattribution, prospect theory, and representativeness biases. We find that both model size and prompt specificity play a significant role in bias susceptibility: larger size (>32B parameters) can reduce bias in 39.5% of cases, while higher prompt detail reduces most biases by up to 14.9%, except in one case (Overattribution), where it exacerbates bias by up to 8.8%.
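To make the evaluation idea concrete, the following is a minimal sketch of how a scenario template might be expanded into prompt variants and how a bias-consistent response rate could be computed from multiple-choice answers. It is not the authors' framework: the function names `expand_template` and `bias_consistent_rate`, the slot-filling scheme, the example scenario, and the hard-coded answers are all illustrative assumptions.

```python
from itertools import product
from string import Template


def expand_template(template, slots):
    """Yield one prompt per combination of slot values (hypothetical scheme)."""
    names = sorted(slots)
    for values in product(*(slots[name] for name in names)):
        yield Template(template).substitute(dict(zip(names, values)))


def bias_consistent_rate(choices, biased_option):
    """Fraction of multiple-choice answers that match the bias-consistent option."""
    if not choices:
        raise ValueError("no responses to score")
    return sum(choice == biased_option for choice in choices) / len(choices)


# Illustrative anchoring-style scenario; not taken from the paper's dataset.
template = (
    "A $item is listed at $anchor. "
    "Is a fair price (A) close to the listing or (B) much lower?"
)
slots = {
    "item": ["used bike", "desk lamp"],
    "anchor": ["500 dollars", "50 dollars"],
}
prompts = list(expand_template(template, slots))

# Stand-in for model outputs: one chosen option per prompt (assumed data).
responses = ["A", "B", "A", "A"]
rate = bias_consistent_rate(responses, biased_option="A")
```

Here each response is scored only against a single option labelled as bias-consistent, which is a simplification; a real evaluation would also need a way to obtain the model's answers and to control for option ordering.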
