CL · AI · HC · Dec 4, 2024

CBEval: A framework for evaluating and interpreting cognitive biases in LLMs

arXiv:2412.03605v1 · 15 citations · h-index: 5
Originality: Incremental advance
AI Analysis

This paper addresses concerns about biased reasoning and decision-making in LLMs, which matter for AI safety and fairness, though it appears incremental because it builds on existing bias evaluation methods.

The authors tackled the problem of cognitive biases in Large Language Models (LLMs) by developing CBEval, a framework to interpret and understand these biases, revealing reasoning limitations and specific biases like round number bias through influence graphs.

Rapid advancements in Large Language Models (LLMs) have significantly enhanced their reasoning capabilities. Despite improved performance on benchmarks, LLMs exhibit notable gaps in their cognitive processes. Additionally, as reflections of human-generated data, these models have the potential to inherit cognitive biases, raising concerns about their reasoning and decision-making capabilities. In this paper we present a framework to interpret, understand, and provide insights into a host of cognitive biases in LLMs. Conducting our research on frontier language models, we are able to elucidate reasoning limitations and biases, and to explain these biases by constructing influence graphs that identify the phrases and words most responsible for biases manifested in LLMs. We further investigate biases such as round number bias and the cognitive bias barrier revealed when noting the framing effect in language models.
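
The abstract does not spell out how the influence graphs are built, so the following is only an illustrative sketch of one common way to estimate which words drive a biased response: leave-one-out ablation. It is not the authors' method. The function names `leave_one_out_influence` and `toy_score`, and the example prompt, are hypothetical; in practice `score` would wrap a call to a language model and return some bias measure.

```python
from typing import Callable, List, Tuple


def leave_one_out_influence(
    prompt: str, score: Callable[[str], float]
) -> List[Tuple[str, float]]:
    """Rank words by how much removing each one changes the score."""
    words = prompt.split()
    baseline = score(prompt)
    influences = []
    for i, word in enumerate(words):
        # Rebuild the prompt with the i-th word removed.
        ablated = " ".join(words[:i] + words[i + 1:])
        influences.append((word, baseline - score(ablated)))
    # Largest absolute change first.
    return sorted(influences, key=lambda pair: abs(pair[1]), reverse=True)


def toy_score(text: str) -> float:
    """Hypothetical stand-in for a model query (an assumption, not a real
    model): pretends the word 'only' raises the bias measure by 1.0."""
    return 1.0 if "only" in text.split() else 0.0


ranking = leave_one_out_influence("The item costs only 100 dollars", toy_score)
print(ranking[0])
```

With this toy scorer, the word "only" ranks first because removing it is the only ablation that changes the score. A real study would need a scorer based on actual model outputs and would likely consider phrases as well as single words.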

