Token Recycling
Speculative decoding
superseded — cited as a baseline and beaten by newer methods
3 papers critique it · 5 beat it on benchmarks
What papers say
Verbatim critique sentences, each from a paper that cites Token Recycling as a baseline.
“Despite producing non-uniform trees, these methods optimize within a single draft source, so the quality differences they exploit remain within-source.”
— Goose: Anisotropic Speculation Trees for Training-Free Speculative Decoding
“Notably, Token Recycle's performance remains flat despite increasing trajectories, unlike other model-free approaches. This limitation likely comes from its lookup table update strategy, which replaces rather than aggregates information from new trajectories.”
— Accelerated Test-Time Scaling with Model-Free Speculative Sampling
“On the other head, model-free retrieval-based drafters, such as CopySpec and Token Recycling, offer training-free and lightweight alternatives but suffer from limited retrieval quality.”
— When, What, and How: Rethinking Retrieval-Enhanced Speculative Decoding
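The critique quoted from Accelerated Test-Time Scaling with Model-Free Speculative Sampling contrasts a lookup-table update that replaces entries with one that aggregates them. The following is a minimal sketch of that distinction only. It assumes a toy table that maps a token to candidate next tokens. The names `update_replace` and `update_aggregate` are hypothetical, and neither function is taken from the Token Recycling or STAND code.

```python
from collections import Counter, defaultdict

# Hypothetical toy tables: each maps a token to candidate next tokens.
# This illustrates the replace-vs-aggregate contrast, not either paper's code.

def update_replace(table, token, candidates):
    """Overwrite the stored candidates for `token` with the newest ones."""
    table[token] = list(candidates)

def update_aggregate(table, token, candidates):
    """Accumulate counts so evidence from every trajectory is retained."""
    table[token].update(candidates)

replace_table = {}
aggregate_table = defaultdict(Counter)

# Three trajectories each propose next tokens after "the".
trajectories = [["cat", "dog"], ["cat", "bird"], ["fish", "cat"]]
for candidates in trajectories:
    update_replace(replace_table, "the", candidates)
    update_aggregate(aggregate_table, "the", candidates)

print(replace_table["the"])                   # only the last trajectory survives
print(aggregate_table["the"].most_common(1))  # most frequent candidate overall
```

Under replacement, adding more trajectories changes what is stored but not how much evidence backs it. That is consistent with the flat scaling the quoted paper describes, although the paper itself only says this is the likely cause.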
Beaten on benchmarks
Head-to-head results where a newer method reports beating Token Recycling. Values are copied from the source paper's tables — verify against the cited paper.
- SuffixDecoding: Extreme Speculative Decoding for Emerging AI Applications
suffix (tree) beats Token Recycling · mean accepted tokens per step [AgenticSQL]
6.236 vs 3.169
- SuffixDecoding: Extreme Speculative Decoding for Emerging AI Applications
suffix (tree) beats Token Recycling · speedup over vanilla [AgenticSQL]
5.175 vs 2.710
- SuffixDecoding: Extreme Speculative Decoding for Emerging AI Applications
hybrid (tree) beats Token Recycling · speedup over vanilla [Spec-Bench (non-agentic)]
4.684 vs 2.548
- Training-Free Loosely Speculative Decoding: Accepting Semantically Correct Drafts Beyond Exact Match
FLy beats Token Recycling · Speedup [L31 70B, Temperature=0]
2.74 vs 2.08
- Training-Free Loosely Speculative Decoding: Accepting Semantically Correct Drafts Beyond Exact Match
FLy beats Token Recycling · Speedup [L31 405B, Temperature=0]
4.80 vs 1.68
- Accelerated Test-Time Scaling with Model-Free Speculative Sampling
STAND beats Token Recycling · Throughput (tokens/sec) [DeepSeek-R1-Distill-Qwen-7B]
67.86 vs 63.91
- Accelerated Test-Time Scaling with Model-Free Speculative Sampling
STAND beats Token Recycling · Throughput (tokens/sec) [Batch Size 4]
128.10 vs 92.82
- Accelerated Test-Time Scaling with Model-Free Speculative Sampling
STAND beats Token Recycling · Throughput (tokens/sec) [Batch Size 8]
150.72 vs 101.36
- Accelerated Test-Time Scaling with Model-Free Speculative Sampling
STAND beats Token Recycling · Throughput (tokens/sec) [Diverse Verifier Tree Search]
83.51 vs 70.54
- OWL: Overcoming Window Length-Dependence in Speculative Decoding for Long-Context Inputs
OWL beats Token Recycling · acceptance_length [Llama-3.1-8B-Instruct]
4.00 vs 3.16
- OWL: Overcoming Window Length-Dependence in Speculative Decoding for Long-Context Inputs
OWL beats Token Recycling · acceptance_length [Llama-3.3-70B-Instruct]
4.27 vs 2.97
- SAM Decoding: Speculative Decoding via Suffix Automaton
SAM-Decoding beats Token Recycling · Speedup [HumanEval]
2.29 vs 1.94
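The pairs above use different metrics and settings, so the raw gaps are not directly comparable. One rough way to read them is the ratio of the newer method's value to Token Recycling's within each row. The sketch below does this for four rows. The values are copied from this page and have not been re-verified against the papers, and `relative_gain` is an illustrative helper name.

```python
# (newer method, setting, newer value, Token Recycling value), copied from the list above.
rows = [
    ("suffix (tree)", "AgenticSQL speedup", 5.175, 2.710),
    ("FLy", "L31 405B speedup", 4.80, 1.68),
    ("STAND", "Batch Size 8 throughput", 150.72, 101.36),
    ("SAM-Decoding", "HumanEval speedup", 2.29, 1.94),
]

def relative_gain(new, baseline):
    """Ratio of the newer method's value to the baseline's, within one row."""
    return new / baseline

for method, setting, new, base in rows:
    print(f"{method:14s} {setting:26s} {relative_gain(new, base):.2f}x")
```

Ratios from different papers still reflect different hardware, models, and baselines, so they indicate direction rather than a ranking.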
Newer alternatives
Recent methods in the same sub-problem, not yet superseded in the knowledge base.
- Collaborative Speculative Decoding (CoSpec) · Beyond the Target: From Imitation to Collaboration in Speculative Decoding · May 24, 2026
- ToolSpec · ToolSpec: Accelerating Tool Calling via Schema-Aware and Retrieval-Augmented Speculative Decoding · Apr 15, 2026
- Quasar · Quasar: Quantized Self-Speculative Acceleration for Rapid Inference via Memory-Efficient Verification · Mar 2, 2026
- FLy (Training-Free Loosely Speculative Decoding) · Training-Free Loosely Speculative Decoding: Accepting Semantically Correct Drafts Beyond Exact Match · Nov 28, 2025
- Nov 1, 2025
- Oct 30, 2025
- Oct 22, 2025
- Oct 8, 2025
- Group Tree Optimization (GTO) · Bridging Draft Policy Misalignment: Group Tree Optimization for Speculative Decoding · Sep 26, 2025
- Sep 22, 2025