CL · Aug 26, 2025

Thinking Before You Speak: A Proactive Test-time Scaling Approach

arXiv:2508.18648v2 · 1 citation · h-index: 23 · EMNLP
Originality: Incremental advance
AI Analysis

This work addresses LLM deficiencies in complex reasoning, offering a method that improves performance without extensive human labeling or fine-tuning. It appears incremental, however, because it builds on existing prompting strategies.

The paper tackles the problem of large language models struggling with complex reasoning tasks like math by proposing a proactive test-time scaling approach that inserts insights between reasoning steps, achieving improved performance on challenging mathematical datasets.

Large Language Models (LLMs) often exhibit deficiencies with complex reasoning tasks, such as maths, which we attribute to the discrepancy between human reasoning patterns and those presented in the LLMs' training data. When dealing with complex problems, humans tend to think carefully before expressing solutions. However, they often do not articulate their inner thoughts, including their intentions and chosen methodologies. Consequently, critical insights essential for bridging reasoning steps may be absent in training data collected from human sources. To bridge this gap, we propose inserting insights between consecutive reasoning steps, which review the status and initiate the next reasoning steps. Unlike prior prompting strategies that rely on a single or a workflow of static prompts to facilitate reasoning, insights are proactively generated to guide reasoning processes. We implement our idea as a reasoning framework, named Thinking Before You Speak (TBYS), and design a pipeline for automatically collecting and filtering in-context examples for the generation of insights, which alleviates human labeling efforts and fine-tuning overheads. Experiments on challenging mathematical datasets verify the effectiveness of TBYS. Project website: https://gitee.com/jswrt/TBYS
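The abstract describes the control flow only at a high level: an insight is generated between consecutive reasoning steps to review the status and initiate the next step. A minimal sketch of that alternation might look like the following. It is not the authors' implementation. The function name `run_with_insights`, its parameters, the `Insight:` / `Step:` labels, and the toy stand-ins for model calls are all hypothetical, and the in-context example collection and filtering pipeline is not modelled here.

```python
from typing import Callable, List

# Hypothetical signatures: each callable receives the problem and the trace so far.
InsightFn = Callable[[str, List[str]], str]
StepFn = Callable[[str, List[str]], str]


def run_with_insights(
    problem: str,
    generate_insight: InsightFn,
    generate_step: StepFn,
    is_final: Callable[[str], bool],
    max_steps: int = 8,
) -> List[str]:
    """Alternate insight generation and reasoning steps until a final step appears."""
    trace: List[str] = []
    for _ in range(max_steps):
        # The insight reviews the current status and proposes what to do next.
        insight = generate_insight(problem, trace)
        trace.append(f"Insight: {insight}")
        # The reasoning step is produced with that insight already in context.
        step = generate_step(problem, trace)
        trace.append(f"Step: {step}")
        if is_final(step):
            break
    return trace


# Toy stand-ins for model calls (assumptions, for illustration only).
def toy_insight(problem: str, trace: List[str]) -> str:
    done = len(trace) // 2  # completed insight/step pairs so far
    return f"{done} step(s) done; decide the next move"


def toy_step(problem: str, trace: List[str]) -> str:
    done = len(trace) // 2  # the fresh insight is already in the trace
    return "final answer: 4" if done >= 1 else "compute 2 + 2"


def toy_is_final(step: str) -> bool:
    return step.startswith("final answer")


if __name__ == "__main__":
    for line in run_with_insights("What is 2 + 2?", toy_insight, toy_step, toy_is_final):
        print(line)
```

In a real setting the two generator callables would wrap LLM calls, with the insight prompt drawing on the automatically collected in-context examples the abstract mentions. Passing them in as parameters keeps the loop independent of any particular model API.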

