Reason from Future: Reverse Thought Chain Enhances LLM Reasoning
This paper addresses inefficient reasoning in small language models on complex tasks, though the contribution appears incremental because it builds on existing paradigms such as Chain-of-Thought.
The paper tackles the problem of locally optimal reasoning in language models by proposing a novel reasoning paradigm called Reason from Future (RFF), which uses bidirectional reasoning to shrink the search space and limit error accumulation, yielding higher accuracy with a smaller search space on complex tasks.
It has been demonstrated that carefully designed reasoning paradigms, like Chain-of-Thought (CoT) and Tree-of-Thought (ToT), can enhance the reasoning capabilities of small language models through detailed thinking and extensive thought searching, but unbounded branching factors in the search space create prohibitive reasoning cost. Moreover, these methods fall into the trap of locally optimal reasoning: the model lacks a global perspective while solving problems. We propose a novel reasoning paradigm called Reason from Future (RFF), which generates reasoning paths by bidirectional reasoning that combines top-down planning with bottom-up reasoning accumulation. The essence of RFF lies in its reverse reasoning mechanism, which prioritizes core logical relationships and imposes goal-oriented constraints on intermediate steps, thereby reducing the search space and mitigating the error accumulation inherent in sequential forward reasoning. Empirical evaluations across diverse experiments demonstrate that RFF outperforms conventional paradigms, solving complex tasks with higher accuracy and a smaller search space.
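To make the search-space argument concrete, here is a minimal toy sketch in Python. It is not the authors' implementation and involves no language model. It assumes a hypothetical puzzle (reach a goal integer from a start integer using only the moves +1 and *2) and compares a forward-only breadth-first search with a two-pass scheme: a top-down pass derives subgoals backward from the goal, and a bottom-up pass takes forward steps that are each checked against the next subgoal. The function names and the backward rule (halve when possible, otherwise subtract 1) are illustrative assumptions.

```python
from collections import deque


def forward_bfs(start, goal):
    """Forward-only search: expand every reachable state level by level.

    Returns (steps_on_shortest_path, states_expanded).
    """
    frontier = deque([(start, 0)])
    seen = {start}
    expanded = 0
    while frontier:
        value, depth = frontier.popleft()
        expanded += 1
        if value == goal:
            return depth, expanded
        for nxt in (value + 1, value * 2):
            # Both moves only increase the value, so overshoots are dead ends.
            if nxt <= goal and nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, depth + 1))
    return None, expanded


def plan_from_goal(start, goal):
    """Top-down pass: walk back from the goal to build an ordered subgoal list.

    Assumes 1 <= start <= goal (a toy restriction, not part of RFF).
    """
    subgoals = [goal]
    value = goal
    while value > start:
        # Undo a doubling when possible, otherwise undo a +1.
        if value % 2 == 0 and value // 2 >= start:
            value //= 2
        else:
            value -= 1
        subgoals.append(value)
    return subgoals[::-1]  # ordered from start to goal


def follow_plan(start, plan):
    """Bottom-up pass: take forward steps, each checked against its subgoal."""
    state = start
    path = [state]
    for subgoal in plan[1:]:
        candidates = (state + 1, state * 2)
        if subgoal not in candidates:
            raise ValueError(f"no single forward step from {state} to {subgoal}")
        state = subgoal
        path.append(state)
    return path


steps, expanded = forward_bfs(1, 20)
path = follow_plan(1, plan_from_goal(1, 20))
print(f"forward BFS: {steps} steps, {expanded} states expanded")
print(f"goal-guided: {len(path) - 1} steps, {len(path)} states visited: {path}")
```

For start 1 and goal 20, the forward search expands 16 states before it reaches the goal, while the goal-guided pass visits only the 6 states on its path (1, 2, 4, 5, 10, 20), and both find a 5-step solution. The gap comes entirely from the goal-derived constraint on each intermediate step. In this toy the backward rule is hand-written domain knowledge, whereas RFF as described in the abstract relies on the model's own reverse reasoning.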