RSD
Reward-Guided Speculative Decoding for Efficient LLM Reasoning
Speculative decoding · first seen Jan 31, 2025
superseded — cited as a baseline and beaten by newer methods
3 papers critique it · 2 beat it on benchmarks
What papers say
Verbatim critique sentences, each from a paper that cites RSD as a baseline.
“Existing approximation algorithms such as K-SEQ sun2023spectr, MSS miao2024specinfer, RSD jeon2024recursive, and LP and vocabulary truncation khisti2025multi cannot guarantee near-optimal speedups: K-SEQ has a $(1-1/e)$-approximation guarantee, but the others have no formal guarantees.”
— Global Resolution: Optimal Multi-Draft Speculative Sampling via Convex Minimization
“However, reward models are neural networks that require additional forward passes, substantially increasing the computational cost of SD.”
— Entropy-Aware Speculative Decoding Toward Improved LLM Reasoning
“Although effective in improving reliability, it incurs substantial drawbacks. First, reliance on external verifiers significantly increases latency and compute overhead. Second, pre-trained reward models are often specialized to specific tasks, making them difficult to generalize across diverse reasoning tasks.”
— From Tokens to Steps: Verification-Aware Speculative Decoding for Efficient Multi-Step Reasoning
Beaten on benchmarks
Head-to-head results where a newer method reports beating RSD. Values are copied from the source paper's tables — verify against the cited paper.
- Entropy-Aware Speculative Decoding Toward Improved LLM Reasoning
EASD beats RSD · Average Accuracy [32B target model + 7B draft model]
52.89 vs 51.88
- Entropy-Aware Speculative Decoding Toward Improved LLM Reasoning
EASD beats RSD · Average Accuracy [72B target model + 7B draft model]
52.12 vs 50.81
- From Tokens to Steps: Verification-Aware Speculative Decoding for Efficient Multi-Step Reasoning
InferSpec beats RSD · MATH500 [Math Model, Draft and Target: Qwen2.5-Math-Instruct]
85.4 vs 82.4
- From Tokens to Steps: Verification-Aware Speculative Decoding for Efficient Multi-Step Reasoning
InferSpec beats RSD · GSM8K [Math Model, Draft and Target: Qwen2.5-Math-Instruct]
95.8 vs 94.4
- From Tokens to Steps: Verification-Aware Speculative Decoding for Efficient Multi-Step Reasoning
InferSpec beats RSD · Gaokao 2023 En [Math Model, Draft and Target: Qwen2.5-Math-Instruct]
69.4 vs 68.5
- From Tokens to Steps: Verification-Aware Speculative Decoding for Efficient Multi-Step Reasoning
InferSpec beats RSD · Olympiad Bench [Math Model, Draft and Target: Qwen2.5-Math-Instruct]
41.2 vs 39.6
- From Tokens to Steps: Verification-Aware Speculative Decoding for Efficient Multi-Step Reasoning
InferSpec beats RSD · MATH500 [General Model, Draft and Target: Qwen2.5-Instruct]
77.0 vs 71.4
- From Tokens to Steps: Verification-Aware Speculative Decoding for Efficient Multi-Step Reasoning
InferSpec beats RSD · GSM8K [General Model, Draft and Target: Qwen2.5-Instruct]
93.0 vs 90.1
- From Tokens to Steps: Verification-Aware Speculative Decoding for Efficient Multi-Step Reasoning
InferSpec beats RSD · Gaokao 2023 En [General Model, Draft and Target: Qwen2.5-Instruct]
66.0 vs 60.5
- From Tokens to Steps: Verification-Aware Speculative Decoding for Efficient Multi-Step Reasoning
InferSpec beats RSD · Olympiad Bench [General Model, Draft and Target: Qwen2.5-Instruct]
40.3 vs 37.6
- From Tokens to Steps: Verification-Aware Speculative Decoding for Efficient Multi-Step Reasoning
InferSpec beats RSD · MATH500 [General Model, Draft: Llama-3.2-Instruct and Target: Llama-3.1-Instruct]
51.6 vs 50.0
- From Tokens to Steps: Verification-Aware Speculative Decoding for Efficient Multi-Step Reasoning
InferSpec beats RSD · GSM8K [General Model, Draft: Llama-3.2-Instruct and Target: Llama-3.1-Instruct]
85.1 vs 83.9
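The margins behind these head-to-head entries can be tabulated with a short script. This is a minimal sketch, not part of any cited paper: the `RESULTS` table and the `margins` helper are hypothetical names introduced here, the scores are transcribed from the list above, and the unit is whatever the source tables use (the list does not state it).

```python
# Each row: (setting, benchmark, newer method, newer score, RSD score).
# Scores are transcribed from the list above; verify against the cited papers.
RESULTS = [
    ("32B target + 7B draft", "Average Accuracy", "EASD", 52.89, 51.88),
    ("72B target + 7B draft", "Average Accuracy", "EASD", 52.12, 50.81),
    ("Math, Qwen2.5-Math-Instruct", "MATH500", "InferSpec", 85.4, 82.4),
    ("Math, Qwen2.5-Math-Instruct", "GSM8K", "InferSpec", 95.8, 94.4),
    ("Math, Qwen2.5-Math-Instruct", "Gaokao 2023 En", "InferSpec", 69.4, 68.5),
    ("Math, Qwen2.5-Math-Instruct", "Olympiad Bench", "InferSpec", 41.2, 39.6),
    ("General, Qwen2.5-Instruct", "MATH500", "InferSpec", 77.0, 71.4),
    ("General, Qwen2.5-Instruct", "GSM8K", "InferSpec", 93.0, 90.1),
    ("General, Qwen2.5-Instruct", "Gaokao 2023 En", "InferSpec", 66.0, 60.5),
    ("General, Qwen2.5-Instruct", "Olympiad Bench", "InferSpec", 40.3, 37.6),
    ("General, Llama-3.2 draft / Llama-3.1 target", "MATH500", "InferSpec", 51.6, 50.0),
    ("General, Llama-3.2 draft / Llama-3.1 target", "GSM8K", "InferSpec", 85.1, 83.9),
]


def margins(results):
    """Return (setting, benchmark, method, newer score minus RSD score) per row."""
    return [
        (setting, bench, method, round(new - rsd, 2))
        for setting, bench, method, new, rsd in results
    ]


if __name__ == "__main__":
    # Print the rows from the widest to the narrowest reported gap.
    for setting, bench, method, delta in sorted(margins(RESULTS), key=lambda r: -r[3]):
        print(f"{method} vs RSD | {bench} [{setting}]: +{delta}")
```

Sorting by margin makes it easy to see that the reported gaps vary widely across settings, so a single "beats RSD" label hides how narrow some of the wins are.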
Newer alternatives
Recent methods in the same sub-problem, not yet superseded in the knowledge base.
- FVO-Spec · Future Validity is the Missing Statistic: From Impossibility to $Φ$-Estimation for Grammar-Faithful Speculative Decoding · May 8, 2026
- From Tokens to Steps · From Tokens to Steps: Verification-Aware Speculative Decoding for Efficient Multi-Step Reasoning · Apr 16, 2026
- Entropy-Aware Speculative Decoding (EASD) · Entropy-Aware Speculative Decoding Toward Improved LLM Reasoning · Dec 29, 2025
- Global Resolution · Global Resolution: Optimal Multi-Draft Speculative Sampling via Convex Minimization · Nov 19, 2025