Rethinking External Slow-Thinking: From Snowball Errors to Probability of Correct Reasoning
This work targets AI researchers who want to improve multi-step reasoning in LLMs. Its contribution is incremental: it builds on existing slow-thinking methods rather than introducing a new paradigm.
The paper investigates the mechanisms behind external slow-thinking methods in large language models. It links snowball errors to the probability of correct reasoning and shows that these methods reduce the error probability. The findings suggest that efficacy depends more on the search scope or the model's internal reasoning capacity than on the specific framework.
Test-time scaling, often referred to as slow-thinking, has been demonstrated to enhance multi-step reasoning in large language models (LLMs). However, despite its widespread use, the mechanisms underlying slow-thinking methods remain poorly understood. This paper explores the mechanisms of external slow-thinking from a theoretical standpoint. We begin by examining the snowball error effect within the LLM reasoning process and connect it to the likelihood of correct reasoning using information theory. Building on this, we show that external slow-thinking methods can be interpreted as strategies to mitigate the error probability. We further provide a comparative analysis of popular external slow-thinking approaches, ranging from simple to complex, highlighting their differences and interrelationships. Our findings suggest that the efficacy of these methods is not primarily determined by the specific framework employed, and that expanding the search scope or the model's internal reasoning capacity may yield more sustained improvements in the long term. We open-source our code at https://github.com/ZyGan1999/Snowball-Errors-and-Probability.
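The intuition that errors compound over reasoning steps, and that sampling more candidates lowers the chance of ending with a wrong answer, can be illustrated with a toy calculation. The sketch below is not the paper's information-theoretic analysis and is not taken from the linked repository. It assumes each step is correct independently with a fixed probability, and that a perfect verifier picks a correct chain whenever one exists among the candidates. The function names `chain_success` and `best_of_n_success` are hypothetical.

```python
def chain_success(step_acc: float, num_steps: int) -> float:
    """Probability that every step of a chain is correct.

    Assumption (for illustration only): steps are independent and each
    is correct with the same probability `step_acc`.
    """
    return step_acc ** num_steps


def best_of_n_success(step_acc: float, num_steps: int, n: int) -> float:
    """Probability that at least one of `n` independent chains is correct.

    Assumption (for illustration only): a perfect verifier selects a
    correct chain whenever one is present among the candidates.
    """
    p = chain_success(step_acc, num_steps)
    return 1.0 - (1.0 - p) ** n


if __name__ == "__main__":
    # Longer chains succeed less often; wider search recovers some of the loss.
    for steps in (5, 10, 20):
        single = chain_success(0.9, steps)
        wide = best_of_n_success(0.9, steps, n=8)
        print(f"steps={steps:2d}  single={single:.3f}  best-of-8={wide:.3f}")
```

Under these simplifying assumptions the gain comes from the number of candidates explored and from the per-step accuracy, which matches the spirit of the claim that search scope and internal capacity matter more than the particular framework. With an imperfect verifier the gain would be smaller.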