PEARL (Speculative decoding): superseded, cited as a baseline and beaten by newer methods. 4 papers critique it and 1 beats it on benchmarks; it ranks #19 of 151 most-superseded. Sub-problem: cluster led by Lookahead. Newer alternatives in the same sub-problem include Collaborative Speculative Decoding (CoSpec), ToolSpec, Quasar, FLy (Training-Free Loosely Speculative Decoding), and Pivot-Aware Speculative Decoding.

Superseded baseline · #19 of 151 most-superseded

PEARL

PEARL: Parallel Speculative Decoding with Adaptive Draft Length

Speculative decoding · first seen May 23, 2023

superseded — cited as a baseline and beaten by newer methods

4 papers critique it · 1 beats it on benchmarks

What papers say

Verbatim critique sentences, each from a paper that cites PEARL as a baseline.

“However, unlike , none of these techniques focus on the data movement cost due to speculation. They require access to output probability distributions and are incompatible with approaches like n-gram speculation.”
— Utility-Driven Speculative Decoding for Mixture-of-Experts
“Most similar to our work, pearl utilizes additional GPU resources to distribute the draft overhead. Our work differs by identifying and exploiting the potential of heterogeneous devices, while dynamically adapting the draft process based on available computational resources and draft outputs.”
— DuoDecoding: Hardware-aware Heterogeneous Speculative Decoding with Dynamic Multi-Sequence Drafting
“PEARL [liu2025pearl] introduces a parallel framework allowing concurrent operation of target and draft models; however, it faces challenges handling resource constraints due to competition when both models are colocated.”
— PACER: Blockwise Pre-verification for Speculative Decoding with Adaptive Length
“PEARL [liu_parallel_2024] is only applicable when the draft model's inference cost per step is in the same order as the target model's.”
— AdaEAGLE: Optimizing Speculative Decoding via Explicit Modeling of Adaptive Draft Structures

Beaten on benchmarks

Head-to-head results where a newer method reports beating PEARL. Values are copied from the source paper's tables — verify against the cited paper.

CARD beats PEARL · speedup (higher is better)
Source: CARD: A Cache-Assisted Parallel Speculative Decoding Framework via Query-and-Correct Paradigm for Accelerating LLM Inference

Benchmark   Setting                    CARD    PEARL
GSM8k       Llama3 70B/1B              4.72    2.94
GSM8k       Llama2 70B/7B              3.39    2.27
GSM8k       Llama2 70B/7B Temp=1.0     2.30    2.12
MGSM        Llama3 70B/1B              4.00    3.02
MGSM        Llama2 70B/7B              3.17    2.37
MGSM        Llama2 70B/7B Temp=1.0     2.74    2.17
MT-Bench    Llama3 70B/1B              3.08    2.50
MT-Bench    Llama2 70B/7B              2.76    2.16
MT-Bench    Llama2 70B/7B Temp=1.0     2.68    2.02
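
To compare the margins across settings, the head-to-head speedups listed above can be turned into a ratio of CARD's speedup to PEARL's. The following is a minimal sketch rather than anything taken from the CARD paper: the names `RESULTS`, `relative_gain`, and `summarize` are hypothetical, the figures are copied from the entries above, and the ratio is only meaningful under the assumption that both speedups in a row were measured against the same baseline.

```python
# Speedup figures copied from the head-to-head entries above:
# (benchmark, setting, CARD speedup, PEARL speedup).
RESULTS = [
    ("GSM8k", "Llama3 70B/1B", 4.72, 2.94),
    ("GSM8k", "Llama2 70B/7B", 3.39, 2.27),
    ("GSM8k", "Llama2 70B/7B Temp=1.0", 2.30, 2.12),
    ("MGSM", "Llama3 70B/1B", 4.00, 3.02),
    ("MGSM", "Llama2 70B/7B", 3.17, 2.37),
    ("MGSM", "Llama2 70B/7B Temp=1.0", 2.74, 2.17),
    ("MT-Bench", "Llama3 70B/1B", 3.08, 2.50),
    ("MT-Bench", "Llama2 70B/7B", 2.76, 2.16),
    ("MT-Bench", "Llama2 70B/7B Temp=1.0", 2.68, 2.02),
]


def relative_gain(card_speedup, pearl_speedup):
    """Ratio of the two speedups.

    Assumption: both speedups share one baseline, so the baseline cancels
    and the ratio compares CARD directly with PEARL.
    """
    return card_speedup / pearl_speedup


def summarize(results):
    """Return (benchmark, setting, ratio) rows, largest ratio first."""
    rows = [(b, s, relative_gain(c, p)) for b, s, c, p in results]
    return sorted(rows, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    for benchmark, setting, ratio in summarize(RESULTS):
        print(f"{benchmark:<9} {setting:<24} {ratio:.2f}x")
```

Sorting by ratio makes it easy to see which settings show the widest and narrowest reported gaps. As with the entries themselves, the inputs should be verified against the cited paper before the ratios are relied on.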

Newer alternatives

Recent methods in the same sub-problem, not yet superseded in the knowledge base.