CL · AI · Nov 14, 2025

A Multifaceted Analysis of Negative Bias in Large Language Models through the Lens of Parametric Knowledge

arXiv:2511.10881v1 · 1 citation · h-index: 7 · IEEE Transactions on Audio, Speech, and Language Processing

Originality: Incremental advance

AI Analysis

This work examines negative bias in LLMs and gives researchers and practitioners insights for mitigating it. The contribution is incremental because it builds on prior detection methods.

The paper investigates negative bias in large language models (LLMs). It shows that prompt format influences responses more than the semantics of the negative response, and it identifies a shortcut behavior in which models generate negative responses when they lack the knowledge to answer. Providing relevant context and an 'I don't know' option generally reduces the bias, whereas chain-of-thought prompting tends to amplify it.

Negative bias refers to the tendency of large language models (LLMs) to excessively generate negative responses in binary decision tasks (e.g., yes-no question answering). Previous research has focused on detecting and addressing negative attention heads that induce negative bias. However, the detailed factors underlying negative bias remain underexplored. In this paper, we demonstrate that LLMs exhibit format-level negative bias, meaning that the prompt format influences their responses more than the semantics of the negative response does. For a fine-grained study of negative bias, we introduce a pipeline for constructing an evaluation set, which systematically categorizes the dataset into three subsets based on the model's parametric knowledge: correct, incorrect, and insufficient relevant knowledge. Through analysis of this evaluation set, we identify a shortcut behavior in which models tend to generate negative responses when they lack sufficient knowledge to answer a yes-no question, leading to negative bias. We further examine how negative bias changes under various prompting scenarios related to parametric knowledge. We observe that providing relevant context and offering an "I don't know" option generally reduce negative bias, whereas chain-of-thought prompting tends to amplify it. Finally, we demonstrate that the degree of negative bias can vary depending on the type of prompt, which influences the direction of the response. Our work reveals the various factors that influence negative bias, providing critical insights for mitigating it in LLMs.
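The abstract does not state how negative bias is scored, so the following is only an illustrative sketch of how a per-subset measurement could look. It assumes a hypothetical score: the share of "no" predictions minus the share of "no" gold labels, computed separately for each knowledge subset. The function name, the record fields (`subset`, `gold`, `pred`), and the toy records are all assumptions made for illustration and are not taken from the paper.

```python
from collections import defaultdict


def negative_bias_by_subset(records):
    """Return a hypothetical negative-bias score per knowledge subset.

    Each record is a dict with:
      'subset': knowledge category, e.g. 'correct', 'incorrect', 'insufficient'
      'gold':   ground-truth answer, 'yes' or 'no'
      'pred':   model answer, 'yes' or 'no'

    Score = (fraction of 'no' predictions) - (fraction of 'no' gold labels).
    A positive score means the model says 'no' more often than the labels do.
    This definition is an assumption, not the paper's metric.
    """
    counts = defaultdict(lambda: {"n": 0, "pred_no": 0, "gold_no": 0})
    for r in records:
        c = counts[r["subset"]]
        c["n"] += 1
        c["pred_no"] += int(r["pred"] == "no")
        c["gold_no"] += int(r["gold"] == "no")
    return {
        subset: (c["pred_no"] - c["gold_no"]) / c["n"]
        for subset, c in counts.items()
    }


# Toy records for illustration only; these are not results from the paper.
toy_records = [
    {"subset": "correct", "gold": "yes", "pred": "yes"},
    {"subset": "correct", "gold": "no", "pred": "no"},
    {"subset": "insufficient", "gold": "yes", "pred": "no"},
    {"subset": "insufficient", "gold": "no", "pred": "no"},
]

if __name__ == "__main__":
    print(negative_bias_by_subset(toy_records))
```

Under this assumed score, a larger value on the insufficient-knowledge subset than on the correct-knowledge subset would be consistent with the shortcut behavior the abstract describes. The same function could be rerun on predictions collected under different prompting scenarios to compare them.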
