CLApr 8, 2025

BiasCause: Evaluate Socially Biased Causal Reasoning of Large Language Models

CMU
arXiv:2504.07997v11 citationsh-index: 27
Originality Incremental advance
AI Analysis

This work addresses a critical gap in evaluating socially biased causal reasoning in LLMs, which is important for developers and researchers aiming to mitigate bias in AI systems, though it is incremental as it builds on existing bias detection benchmarks.

The paper tackles the problem of understanding the causal reasoning behind social biases in large language models (LLMs) by proposing a framework to classify and evaluate this reasoning, finding that all tested models produce biased causal graphs in the majority of cases, with 4135 biased instances identified. It also discovers strategies to avoid bias and reveals a tendency for LLMs to confuse correlation with causality, leading to mistaken-biased reasoning.

While large language models (LLMs) already play significant roles in society, research has shown that LLMs still generate content including social bias against certain sensitive groups. While existing benchmarks have effectively identified social biases in LLMs, a critical gap remains in our understanding of the underlying reasoning that leads to these biased outputs. This paper goes one step further to evaluate the causal reasoning process of LLMs when they answer questions eliciting social biases. We first propose a novel conceptual framework to classify the causal reasoning produced by LLMs. Next, we use LLMs to synthesize $1788$ questions covering $8$ sensitive attributes and manually validate them. The questions can test different kinds of causal reasoning by letting LLMs disclose their reasoning process with causal graphs. We then test 4 state-of-the-art LLMs. All models answer the majority of questions with biased causal reasoning, resulting in a total of $4135$ biased causal graphs. Meanwhile, we discover $3$ strategies for LLMs to avoid biased causal reasoning by analyzing the "bias-free" cases. Finally, we reveal that LLMs are also prone to "mistaken-biased" causal reasoning, where they first confuse correlation with causality to infer specific sensitive group names and then incorporate biased causal reasoning.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes