CL | AI | Feb 21, 2025

Does Reasoning Introduce Bias? A Study of Social Bias Evaluation and Mitigation in LLM Reasoning

arXiv:2502.15361v3 | 19 citations | h-index: 6 | Has Code | EMNLP
Originality: Incremental advance
AI Analysis

This work addresses social bias in LLM reasoning, a concern for LLM users, but it is incremental because it builds on existing bias evaluation methods.

The study systematically evaluated social bias in LLM-generated reasoning, finding that biased reasoning steps correlate with incorrect predictions and stereotype expression. It proposed ADBP, a lightweight mitigation method that outperformed a baseline in reducing bias and improving accuracy.

Recent advances in large language models (LLMs) have enabled automatic generation of chain-of-thought (CoT) reasoning, leading to strong performance on tasks such as math and code. However, when reasoning steps reflect social stereotypes (e.g., those related to gender, race or age), they can reinforce harmful associations and lead to misleading conclusions. We present the first systematic evaluation of social bias within LLM-generated reasoning, focusing on reasoning language models (e.g., DeepSeek-R1, OpenAI o1) that natively produce reasoning chains as part of their answers. Using the BBQ dataset, we analyze both prediction accuracy and reasoning bias across a broad spectrum of models, including instruction-tuned and CoT-augmented variants of DeepSeek-R1 (8B/32B), ChatGPT, and other open-source LLMs. We quantify how biased reasoning steps correlate with incorrect predictions and often lead to stereotype expression. To mitigate reasoning-induced bias, we propose Answer Distribution as Bias Proxy (ADBP), a lightweight mitigation method that detects bias by tracking how model predictions change across incremental reasoning steps. ADBP outperforms the Stereotype-free Reasoning Pattern (SfRP) baseline in most cases, mitigating bias and improving the accuracy of LLM outputs. Evaluation and mitigation code is available at https://github.com/elviswxy/LLM_reasoning_bias.
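
The idea of tracking how predictions change across incremental reasoning steps can be illustrated with a minimal Python sketch. This is not the authors' implementation (see the linked repository for that); it only shows the general pattern of re-querying a predictor on growing prefixes of a reasoning chain and summarizing the resulting answers. The names `answers_over_prefixes`, `answer_distribution`, `first_shift`, and `toy_predict` are hypothetical, and the toy predictor stands in for a real LLM call.

```python
from collections import Counter
from typing import Callable, Dict, List, Optional, Sequence


def answers_over_prefixes(
    steps: Sequence[str],
    predict: Callable[[Sequence[str]], str],
) -> List[str]:
    """Ask the predictor for an answer given the first 0, 1, ..., n steps."""
    return [predict(steps[:k]) for k in range(len(steps) + 1)]


def answer_distribution(answers: Sequence[str]) -> Dict[str, float]:
    """Relative frequency of each answer across all prefixes."""
    counts = Counter(answers)
    total = len(answers)
    return {answer: count / total for answer, count in counts.items()}


def first_shift(answers: Sequence[str]) -> Optional[int]:
    """1-based index of the first step after which the answer changes.

    answers[k] is the prediction after the first k steps, so a change
    between answers[k - 1] and answers[k] is attributed to step k.
    Returns None if the answer never changes.
    """
    for k in range(1, len(answers)):
        if answers[k] != answers[k - 1]:
            return k
    return None


def toy_predict(prefix: Sequence[str]) -> str:
    """Hypothetical stand-in for an LLM: flips once a generalizing step appears."""
    if any("usually" in step for step in prefix):
        return "grandfather"
    return "unknown"


# Illustrative reasoning chain for an ambiguous, BBQ-style question.
steps = [
    "The context does not say who forgot the meeting.",
    "Older people usually forget things.",
    "So the answer is the grandfather.",
]

answers = answers_over_prefixes(steps, toy_predict)
distribution = answer_distribution(answers)
shift_step = first_shift(answers)
```

In this toy run the answer moves away from "unknown" at the second step, which is the step that introduces the stereotype. How such a shift is scored and how the final answer is chosen are design decisions that this sketch leaves open.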

