Is Token Recycling superseded?

Token Recycling (speculative decoding) is superseded: newer methods cite it as a baseline and report beating it. 3 papers critique it and 5 beat it on benchmarks, which places it #14 of 151 on the most-superseded ranking. It belongs to the sub-problem cluster led by Lookahead. Newer alternatives in the same sub-problem include Collaborative Speculative Decoding (CoSpec), ToolSpec, Quasar, FLy (Training-Free Loosely Speculative Decoding), and Pivot-Aware Speculative Decoding.

What papers say

Verbatim critique sentences, each from a paper that cites Token Recycling as a baseline.

“Despite producing non-uniform trees, these methods optimize within a single draft source, so the quality differences they exploit remain within-source.”
— Goose: Anisotropic Speculation Trees for Training-Free Speculative Decoding
“Notably, Token Recycle's performance remains flat despite increasing trajectories, unlike other model-free approaches. This limitation likely comes from its lookup table update strategy, which replaces rather than aggregates information from new trajectories.”
— Accelerated Test-Time Scaling with Model-Free Speculative Sampling
“On the other [hand], model-free retrieval-based drafters, such as CopySpec and Token Recycling, offer training-free and lightweight alternatives but suffer from limited retrieval quality.”
— When, What, and How: Rethinking Retrieval-Enhanced Speculative Decoding
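
The second critique above contrasts a lookup table that replaces its entries with one that aggregates across trajectories. The toy sketch below illustrates only that distinction; it is not the implementation from Token Recycling or from any cited paper. The names `TOP_K`, `update_replace`, `update_aggregate`, and `top_candidates`, as well as the example trajectories, are hypothetical.

```python
from collections import Counter, defaultdict

TOP_K = 2  # hypothetical number of candidate successors kept per token


def update_replace(table, token, candidates):
    """Overwrite the stored candidates for `token` with the newest ones."""
    table[token] = list(candidates)[:TOP_K]


def update_aggregate(counts, token, candidates):
    """Accumulate how often each candidate followed `token` across trajectories."""
    counts[token].update(candidates)


def top_candidates(counts, token):
    """Return the TOP_K most frequently observed candidates for `token`."""
    return [tok for tok, _ in counts[token].most_common(TOP_K)]


# Three hypothetical trajectories, each proposing successors for the token "x".
trajectories = [["a", "b"], ["a", "b"], ["d", "e"]]

replace_table = {}
aggregate_counts = defaultdict(Counter)
for candidates in trajectories:
    update_replace(replace_table, "x", candidates)
    update_aggregate(aggregate_counts, "x", candidates)

# The replacing table remembers only the most recent trajectory,
# while the aggregating table keeps the candidates seen most often.
print(replace_table["x"])
print(top_candidates(aggregate_counts, "x"))
```

Under this simplification, adding more trajectories cannot improve the replacing table beyond what the latest trajectory offers, which is one way to read the flat performance the quote describes.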

Beaten on benchmarks

Head-to-head results where a newer method reports beating Token Recycling. Values are copied from the source paper's tables — verify against the cited paper.

suffix (tree) beats Token Recycling · mean accepted tokens per step [AgenticSQL]: 6.236 vs 3.169
SuffixDecoding: Extreme Speculative Decoding for Emerging AI Applications

suffix (tree) beats Token Recycling · speedup over vanilla [AgenticSQL]: 5.175 vs 2.710
SuffixDecoding: Extreme Speculative Decoding for Emerging AI Applications

hybrid (tree) beats Token Recycling · speedup over vanilla [Spec-Bench (non-agentic)]: 4.684 vs 2.548
SuffixDecoding: Extreme Speculative Decoding for Emerging AI Applications

FLy beats Token Recycling · Speedup [L31 70B, Temperature=0]: 2.74 vs 2.08
Training-Free Loosely Speculative Decoding: Accepting Semantically Correct Drafts Beyond Exact Match

FLy beats Token Recycling · Speedup [L31 405B, Temperature=0]: 4.80 vs 1.68
Training-Free Loosely Speculative Decoding: Accepting Semantically Correct Drafts Beyond Exact Match

STAND beats Token Recycling · Throughput (tokens/sec) [DeepSeek-R1-Distill-Qwen-7B]: 67.86 vs 63.91
Accelerated Test-Time Scaling with Model-Free Speculative Sampling

STAND beats Token Recycling · Throughput (tokens/sec) [Batch Size 4]: 128.10 vs 92.82
Accelerated Test-Time Scaling with Model-Free Speculative Sampling

STAND beats Token Recycling · Throughput (tokens/sec) [Batch Size 8]: 150.72 vs 101.36
Accelerated Test-Time Scaling with Model-Free Speculative Sampling

STAND beats Token Recycling · Throughput (tokens/sec) [Diverse Verifier Tree Search]: 83.51 vs 70.54
Accelerated Test-Time Scaling with Model-Free Speculative Sampling

OWL beats Token Recycling · acceptance length [Llama-3.1-8B-Instruct]: 4.00 vs 3.16
OWL: Overcoming Window Length-Dependence in Speculative Decoding for Long-Context Inputs

OWL beats Token Recycling · acceptance length [Llama-3.3-70B-Instruct]: 4.27 vs 2.97
OWL: Overcoming Window Length-Dependence in Speculative Decoding for Long-Context Inputs

SAM-Decoding beats Token Recycling · Speedup [HumanEval]: 2.29 vs 1.94
SAM Decoding: Speculative Decoding via Suffix Automaton
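
Because the entries above mix metrics (speedup, throughput, accepted tokens), a relative gain over Token Recycling can make them easier to read side by side. The sketch below computes that ratio for a few of the listed pairs. The helper `relative_gain` and the selection of rows are illustrative choices, not part of any cited paper, and gains from different papers are not directly comparable since models, hardware, and settings differ.

```python
# (newer method, setting, newer value, Token Recycling value),
# copied from the head-to-head entries listed above.
results = [
    ("suffix (tree)", "AgenticSQL, speedup over vanilla", 5.175, 2.710),
    ("FLy", "L31 70B, speedup", 2.74, 2.08),
    ("SAM-Decoding", "HumanEval, speedup", 2.29, 1.94),
]


def relative_gain(new_value, baseline_value):
    """Fractional improvement of the newer method over the baseline."""
    return new_value / baseline_value - 1.0


for method, setting, new_value, baseline_value in results:
    gain = relative_gain(new_value, baseline_value)
    print(f"{method} [{setting}]: {gain:+.1%} vs Token Recycling")
```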

Newer alternatives

Recent methods in the same sub-problem, not yet superseded in the knowledge base.