Asking LLMs to Verify First is Almost Free Lunch
This work addresses the challenge of improving LLM reasoning at low cost for users who need scalable solutions. The contribution is incremental, since it builds on existing prompting methods.
The paper tackles the problem of enhancing reasoning in Large Language Models (LLMs) without high costs. It introduces Verification-First (VF), a strategy that prompts models to verify a candidate answer before generating a solution. It reports that VF with random answers consistently outperforms standard Chain-of-Thought (CoT) with minimal overhead, and that Iter-VF outperforms existing test-time scaling strategies across various benchmarks and models.
To enhance the reasoning capabilities of Large Language Models (LLMs) without the high cost of training or extensive test-time sampling, we introduce Verification-First (VF), a strategy that prompts models to verify a provided candidate answer, even a trivial or random one, before generating a solution. This approach triggers a "reverse reasoning" process that is cognitively easier and complementary to standard forward Chain-of-Thought (CoT), effectively invoking the model's critical thinking to reduce logical errors. We further generalize the VF strategy to Iter-VF, a sequential test-time scaling (TTS) method that iteratively cycles the verification-generation process using the model's previous answer. Extensive experiments across various benchmarks (from mathematical reasoning to coding and agentic tasks) and various LLMs (from open-source 1B models to cutting-edge commercial ones) confirm that VF with a random answer consistently outperforms standard CoT with minimal computational overhead, and that Iter-VF outperforms existing TTS strategies.
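The verify-then-solve loop described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's implementation: the prompt wording, the "Answer:" output convention, the default candidate "1", and the `llm` callable (any function that maps a prompt string to a response string) are all hypothetical choices made for this example.

```python
import re
from typing import Callable, Optional

# Hypothetical interface: any function that maps a prompt to a model response.
LLM = Callable[[str], str]


def build_vf_prompt(question: str, candidate: str) -> str:
    """Build a Verification-First prompt (wording is an assumption)."""
    return (
        f"Question: {question}\n"
        f"A possible answer is: {candidate}\n"
        "First verify whether this answer is correct, step by step. "
        "Then solve the problem yourself and finish with a line of the form "
        "'Answer: <final answer>'."
    )


def extract_answer(response: str) -> Optional[str]:
    """Return the text after the last 'Answer:' marker, or None if absent."""
    matches = re.findall(r"Answer:\s*(.+)", response)
    return matches[-1].strip() if matches else None


def verify_first(question: str, llm: LLM, candidate: str = "1") -> Optional[str]:
    """Single VF call: verify a (possibly trivial) candidate, then solve."""
    return extract_answer(llm(build_vf_prompt(question, candidate)))


def iter_vf(question: str, llm: LLM, initial_candidate: str = "1",
            rounds: int = 3) -> str:
    """Iter-VF sketch: feed each round's answer back as the next candidate."""
    candidate = initial_candidate
    for _ in range(rounds):
        answer = verify_first(question, llm, candidate)
        if answer is None:
            # No parseable answer this round; keep the last candidate.
            break
        candidate = answer
    return candidate
```

With `rounds=1` and a trivial starting candidate this reduces to a single VF call. Larger values of `rounds` correspond to the sequential scaling described for Iter-VF, with each extra round costing one additional model call.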