First Heuristic Then Rational: Dynamic Use of Heuristics in Language Model Reasoning
This incremental insight into LM reasoning behavior could help improve prompting strategies for researchers and practitioners in natural language processing.
The study investigated how language models (LMs) use heuristics such as lexical overlap during multi-step reasoning. It found that reliance on heuristics is higher in the earlier stages of reasoning and decreases as the models approach the final answer, which indicates that LMs combine strategies dynamically.
Multi-step reasoning instructions, such as chain-of-thought prompting, are widely adopted to elicit better performance from language models (LMs). We report on the systematic strategy that LMs employ in such a multi-step reasoning process. Our controlled experiments reveal that LMs rely more heavily on heuristics, such as lexical overlap, in the earlier stages of reasoning, where more reasoning steps remain to reach a goal. Conversely, their reliance on heuristics decreases as LMs progress closer to the final answer through multiple reasoning steps. This suggests that LMs can track only a limited number of future steps and dynamically combine heuristic strategies with rational ones in tasks involving multi-step reasoning.
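To make the lexical overlap heuristic concrete, the following minimal sketch scores candidate premises by word overlap with a question and picks the highest-scoring one. It is an illustration only, not the experimental setup of the study: the function names (`tokens`, `lexical_overlap`, `pick_by_overlap`), the Jaccard-style score, and the toy question and premises are all assumptions introduced here.

```python
import re


def tokens(text):
    """Lowercase the text and return its set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def lexical_overlap(query, candidate):
    """Jaccard overlap between the word sets of two strings (assumed metric)."""
    q, c = tokens(query), tokens(candidate)
    if not q or not c:
        return 0.0
    return len(q & c) / len(q | c)


def pick_by_overlap(query, candidates):
    """Select the candidate sharing the most vocabulary with the query."""
    return max(candidates, key=lambda c: lexical_overlap(query, c))


# Hypothetical toy problem: the third premise is a distractor that shares
# surface words with the question but is irrelevant to answering it.
question = "How many apples does Alice have?"
premises = [
    "Alice has 3 more apples than Bob.",
    "Bob has 5 apples.",
    "Alice's sister has 2 apples.",
]

for p in premises:
    print(f"{lexical_overlap(question, p):.2f}  {p}")

print("Heuristic choice:", pick_by_overlap(question, premises))
```

In this toy case the purely lexical strategy selects the distractor premise, while a rational strategy would start from "Bob has 5 apples." because the first premise depends on it. The abstract's claim is that this kind of surface-driven choice is more likely when many reasoning steps still remain.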