CL · AI · Feb 19, 2025

SIFT: Grounding LLM Reasoning in Contexts via Stickers

arXiv:2502.14922v1 · 5 citations · h-index: 5 · Has Code · EMNLP
Originality: Highly original
AI Analysis

This paper addresses context misinterpretation in LLMs, offering an incremental improvement in reasoning accuracy on tasks such as math problem-solving.

The paper tackles the problem of large language models misinterpreting context during reasoning, for example misreading 'per' in a calculation. It introduces SIFT, a post-training method that uses self-generated 'Stickers' to ground reasoning in the context. SIFT improved DeepSeek-R1's pass@1 accuracy on AIME2024 from 78.33% to 85.67%.

This paper identifies that misinterpretation of the context can be a significant issue during the reasoning process of large language models, from smaller models like Llama3.2-3B-Instruct to cutting-edge ones like DeepSeek-R1. For example, in the phrase "10 dollars per kilo," LLMs might not recognize that "per" means "for each," leading to calculation errors. We introduce a novel post-training approach called **Stick to the Facts (SIFT)** to tackle this. SIFT leverages increasing inference-time compute to ground LLM reasoning in contexts. At the core of SIFT lies the *Sticker*, which is generated by the model itself to explicitly emphasize the key information within the context. Given the curated Sticker, SIFT generates two predictions -- one from the original query and one from the query augmented with the Sticker. If they differ, the Sticker is sequentially refined via *forward* optimization (to better align the extracted facts with the query) and *inverse* generation (to conform with the model's inherent tendencies) for more faithful reasoning outcomes. Studies across diverse models (from 3B to 100B+) and benchmarks (e.g., GSM8K, MATH-500) reveal consistent performance improvements. Notably, SIFT improves the pass@1 accuracy of DeepSeek-R1 on AIME2024 from 78.33% to **85.67%**, establishing a new state-of-the-art in the open-source community. The code is available at https://github.com/zhijie-group/SIFT.
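The compare-and-refine loop described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the `LLM` callable, the prompt wording, the helper names (`generate_sticker`, `predict`, `forward_optimize`, `inverse_generate`, `sift`), the `max_rounds` cap, the exact-string agreement check, and the choice to return the Sticker-augmented answer when the two predictions never agree are all hypothetical. In particular, inverse generation is modelled here simply as re-deriving the Sticker from the model's own prediction.

```python
from typing import Callable, Optional

# Hypothetical interface: any function mapping a prompt string to a completion.
LLM = Callable[[str], str]


def generate_sticker(llm: LLM, query: str) -> str:
    """Ask the model to restate the key facts of the query (the Sticker)."""
    return llm(f"[STICKER] List the key conditions and the question.\nQuery: {query}")


def predict(llm: LLM, query: str, sticker: Optional[str] = None) -> str:
    """Answer the query, optionally with the Sticker appended to the prompt."""
    prompt = f"[ANSWER] Query: {query}"
    if sticker is not None:
        prompt += f"\nSticker: {sticker}"
    return llm(prompt).strip()


def forward_optimize(llm: LLM, query: str, sticker: str) -> str:
    """Forward step (assumed form): realign the Sticker with the query text."""
    return llm(f"[FORWARD] Revise the Sticker to match the query.\n"
               f"Query: {query}\nSticker: {sticker}")


def inverse_generate(llm: LLM, prediction: str) -> str:
    """Inverse step (assumed form): infer a Sticker from the model's own answer."""
    return llm(f"[INVERSE] Infer the Sticker implied by this answer.\n"
               f"Prediction: {prediction}")


def sift(llm: LLM, query: str, max_rounds: int = 3) -> str:
    """Refine the Sticker until both predictions agree or the round cap is hit."""
    baseline = predict(llm, query)             # prediction from the query alone
    sticker = generate_sticker(llm, query)
    augmented = predict(llm, query, sticker)   # prediction from query + Sticker
    rounds = 0
    # Exact string equality is an assumed stand-in for "the predictions agree".
    while augmented != baseline and rounds < max_rounds:
        sticker = forward_optimize(llm, query, sticker)
        sticker = inverse_generate(llm, predict(llm, query, sticker))
        augmented = predict(llm, query, sticker)
        rounds += 1
    # Assumption: fall back to the Sticker-augmented answer if no agreement.
    return augmented
```

Passing the model as a plain callable keeps the sketch independent of any particular inference API; in practice `llm` would wrap a call to a model such as the ones evaluated in the paper, and the agreement check would compare extracted final answers, not raw strings.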

Code Implementations: 1 repo