CL | Sep 17, 2025

Thinking in a Crowd: How Auxiliary Information Shapes LLM Reasoning

Haodong Zhao, Chenyan Zhao, Yansi Li, Zhuosheng Zhang, Gongshen Liu

arXiv:2509.18163v18.34 citations | h-index: 10 | Has Code

Originality: Incremental advance

AI Analysis

The paper addresses a critical vulnerability of LLMs in real-world applications, where external information can be unreliable, and highlights the need for better evaluation methods.

The paper investigates how auxiliary information affects LLM reasoning, finding that while helpful context improves accuracy, misleading information causes a catastrophic performance drop, amplified by step-by-step thinking.

The capacity of Large Language Models (LLMs) to reason is fundamental to their application in complex, knowledge-intensive domains. In real-world scenarios, LLMs are often augmented with external information that can be helpful, irrelevant, or even misleading. This paper investigates the causal impact of such auxiliary information on the reasoning process of LLMs with explicit step-by-step thinking capabilities. We introduce SciAux, a new dataset derived from ScienceQA, to systematically test the robustness of the model against these types of information. Our findings reveal a critical vulnerability: the model's deliberative "thinking mode" is a double-edged sword. While helpful context improves accuracy, misleading information causes a catastrophic drop in performance, which is amplified by the thinking process. Instead of conferring robustness, thinking reinforces the degree of error when provided with misinformation. This highlights that the challenge is not merely to make models "think", but to endow them with the critical faculty to evaluate the information upon which their reasoning is based. The SciAux dataset is available at https://huggingface.co/datasets/billhdzhao/SciAux.
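The comparison described in the abstract (the same question answered with no auxiliary text and with helpful, irrelevant, or misleading auxiliary text) can be pictured as a small evaluation loop. The sketch below is illustrative only and is not the authors' code: the record fields (`question`, `choices`, `answer`, `aux`), the prompt wording, and the `answer_fn` callable are all assumptions made for this example and are not taken from the SciAux schema.

```python
from collections import defaultdict

# Condition names are assumptions for this sketch; "none" means no auxiliary text.
CONDITIONS = ("none", "helpful", "irrelevant", "misleading")


def build_prompt(question, choices, aux=None):
    """Format a multiple-choice question, optionally preceded by auxiliary text."""
    lines = []
    if aux:
        lines.append(f"Context: {aux}")
    lines.append(f"Question: {question}")
    for i, choice in enumerate(choices):
        # Label the choices (A), (B), (C), ...
        lines.append(f"({chr(ord('A') + i)}) {choice}")
    lines.append("Answer with a single letter.")
    return "\n".join(lines)


def accuracy_by_condition(items, answer_fn):
    """Return {condition: accuracy}.

    items     : list of dicts with hypothetical keys
                "question", "choices", "answer" (a letter), and
                "aux" (a dict mapping condition name to auxiliary text).
    answer_fn : callable taking a prompt string and returning a letter;
                stands in for a model call.
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for item in items:
        for cond in CONDITIONS:
            aux = item["aux"].get(cond)
            # Skip conditions for which this item has no auxiliary text.
            if cond != "none" and aux is None:
                continue
            prompt = build_prompt(item["question"], item["choices"], aux)
            total[cond] += 1
            correct[cond] += int(answer_fn(prompt) == item["answer"])
    return {cond: correct[cond] / total[cond] for cond in total}
```

Under these assumptions, the effect of the thinking mode could be examined by calling `accuracy_by_condition` twice on the same items, once with an `answer_fn` that queries the model with thinking enabled and once with it disabled, and comparing the per-condition accuracies.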
