LG · AI · CL · SE · Feb 23, 2025

DISC: Dynamic Decomposition Improves LLM Inference Scaling

arXiv:2502.16706v3 · 10 citations · h-index: 13
Originality: Incremental advance
AI Analysis

The paper addresses compute allocation in LLM inference for tasks such as coding and math, offering an incremental improvement over static decomposition methods.

It tackles inefficient inference scaling in large language models by proposing dynamic decomposition, which adaptively partitions reasoning traces into steps during inference. The reported pass@10 error-rate reductions over static decompositions range from 5.0% to 10.5% on benchmarks such as APPS, MATH, and LiveCodeBench.
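The abstract states these reductions in terms of pass@10 error rate. As a rough illustration of what such a metric measures, the sketch below uses the unbiased pass@k estimator that is common in code-generation evaluation; whether the paper computes pass@10 in exactly this way is an assumption, and the function names are hypothetical.

```python
from math import comb
from typing import List, Tuple


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k for one problem from n samples, c of which are correct.

    Uses the common unbiased estimator 1 - C(n - c, k) / C(n, k).
    """
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    if n - c < k:
        # Fewer than k incorrect samples: any k-subset contains a correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)


def pass_at_k_error_rate(results: List[Tuple[int, int]], k: int) -> float:
    """Error rate = 1 - mean pass@k over problems; results holds (n, c) pairs."""
    mean_pass = sum(pass_at_k(n, c, k) for n, c in results) / len(results)
    return 1.0 - mean_pass
```

For example, `pass_at_k_error_rate([(10, 0), (10, 1)], k=10)` treats the first problem as failed and the second as solved, so half of the problems count as errors.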

Inference scaling methods for LLMs often rely on decomposing problems into steps (or groups of tokens), followed by sampling and selecting the best next steps. However, these steps and their sizes are often predetermined or manually designed based on domain knowledge. We propose dynamic decomposition, a method that adaptively and automatically partitions solution and reasoning traces into manageable steps during inference. By more effectively allocating compute -- particularly through subdividing challenging steps and prioritizing their sampling -- dynamic decomposition significantly improves inference efficiency. Experiments on benchmarks such as APPS, MATH, and LiveCodeBench demonstrate that dynamic decomposition outperforms static approaches, including token-level, sentence-level, and single-step decompositions, reducing the pass@10 error rate by 5.0%, 6.7%, and 10.5% respectively. These findings highlight the potential of dynamic decomposition to improve a wide range of inference scaling techniques.
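To make the idea of subdividing challenging steps concrete, here is a minimal toy sketch. It is not the authors' algorithm: the acceptance rule (commit a step only if sampling from it does at least as well as the current best, otherwise halve the step), the parameter names, and the character-matching toy task are all assumptions chosen for illustration. `sample` stands in for an LLM that completes a prefix, and `reward` for a verifier or reward model.

```python
import random
from typing import Callable, List, Tuple


def dynamic_decompose(
    sample: Callable[[str], str],    # hypothetical: completes a solution from a prefix
    reward: Callable[[str], float],  # hypothetical: scores a full solution
    budget: int,                     # total number of sample() calls allowed
    n_samples: int = 4,              # samples drawn per decision
    init_fraction: float = 0.5,      # initial step size as a fraction of the remainder
) -> Tuple[str, List[str], float]:
    if budget < 1 or n_samples < 1:
        raise ValueError("budget and n_samples must be at least 1")

    def best_of(prefix: str, n: int) -> str:
        return max((prefix + sample(prefix) for _ in range(n)), key=reward)

    used = min(n_samples, budget)
    best = best_of("", used)
    best_r = reward(best)
    prefix, steps, fraction = "", [], init_fraction

    while used < budget and len(prefix) < len(best):
        remaining = best[len(prefix):]
        step_len = max(1, int(len(remaining) * fraction))
        step = remaining[:step_len]
        n = min(n_samples, budget - used)
        cand = best_of(prefix + step, n)
        used += n
        cand_r = reward(cand)
        if cand_r >= best_r or step_len == 1:
            # Step looks fine (or cannot be split further): commit it.
            prefix += step
            steps.append(step)
            fraction = init_fraction
            if cand_r >= best_r:
                best, best_r = cand, cand_r
        else:
            # Step looks hard: subdivide it, so more samples land here.
            fraction /= 2

    if len(prefix) < len(best):
        steps.append(best[len(prefix):])  # uncommitted tail becomes the last step
    return best, steps, best_r


def make_toy_task(target: str, seed: int = 0, p_correct: float = 0.3):
    """Toy stand-in for an LLM and verifier: guess the target string per character."""
    rng = random.Random(seed)
    calls = [0]
    alphabet = "abcdefghijklmnopqrstuvwxyz "

    def sample(prefix: str) -> str:
        calls[0] += 1
        return "".join(
            ch if rng.random() < p_correct else rng.choice(alphabet)
            for ch in target[len(prefix):]
        )

    def reward(solution: str) -> float:
        return sum(a == b for a, b in zip(solution, target)) / len(target)

    return sample, reward, calls
```

Calling `dynamic_decompose(*make_toy_task("dynamic decomposition")[:2], budget=40)` returns the best solution found, the steps it was split into, and its reward. Steps that were repeatedly halved end up short, which marks the regions where the sampling budget was concentrated.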

