PEARL
PEARL: Prompting Large Language Models to Plan and Execute Actions Over Long Documents
Speculative decoding · first seen May 23, 2023
superseded — cited as a baseline and beaten by newer methods
4 papers critique it · 1 beat it on benchmarks
What papers say
Verbatim critique sentences, each from a paper that cites PEARL as a baseline.
“However, unlike , none of these techniques focus on the data movement cost due to speculation. They require access to output probability distributions and are incompatible with approaches like n-gram speculation.”
Source: Utility-Driven Speculative Decoding for Mixture-of-Experts
“Most similar to our work, pearl utilizes additional GPU resources to distribute the draft overhead. Our work differs by identifying and exploiting the potential of heterogeneous devices, while dynamically adapting the draft process based on available computational resources and draft outputs.”
Source: DuoDecoding: Hardware-aware Heterogeneous Speculative Decoding with Dynamic Multi-Sequence Drafting
“PEARL [liu2025pearl] introduces a parallel framework allowing concurrent operation of target and draft models; however, it faces challenges handling resource constraints due to competition when both models are colocated.”
Source: PACER: Blockwise Pre-verification for Speculative Decoding with Adaptive Length
“PEARL [liu_parallel_2024] is only applicable when the draft model's inference cost per step is in the same order as the target model's.”
Source: AdaEAGLE: Optimizing Speculative Decoding via Explicit Modeling of Adaptive Draft Structures
Beaten on benchmarks
Head-to-head results where a newer method reports beating PEARL. Values are copied from the source paper's tables — verify against the cited paper.
All rows below are from CARD: A Cache-Assisted Parallel Speculative Decoding Framework via Query-and-Correct Paradigm for Accelerating LLM Inference. Each row reports speedup as CARD vs PEARL.
- GSM8k with Llama3 70B/1B: 4.72 vs 2.94
- GSM8k with Llama2 70B/7B: 3.39 vs 2.27
- GSM8k with Llama2 70B/7B Temp=1.0: 2.30 vs 2.12
- MGSM with Llama3 70B/1B: 4.00 vs 3.02
- MGSM with Llama2 70B/7B: 3.17 vs 2.37
- MGSM with Llama2 70B/7B Temp=1.0: 2.74 vs 2.17
- MT-Bench with Llama3 70B/1B: 3.08 vs 2.50
- MT-Bench with Llama2 70B/7B: 2.76 vs 2.16
- MT-Bench with Llama2 70B/7B Temp=1.0: 2.68 vs 2.02
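To put the rows above on one scale, the two speedups in each row can be divided to give CARD's relative gain over PEARL. The sketch below is illustrative only: the numbers are copied from the list above, the names `RESULTS` and `relative_gain` are hypothetical, and it assumes both speedups in a row are measured against the same baseline, which should be checked in the CARD paper.

```python
# (CARD speedup, PEARL speedup) pairs copied from the list above.
# Assumption: both values in a pair share the same baseline in the source paper.
RESULTS = {
    ("GSM8k", "Llama3 70B/1B"): (4.72, 2.94),
    ("GSM8k", "Llama2 70B/7B"): (3.39, 2.27),
    ("GSM8k", "Llama2 70B/7B Temp=1.0"): (2.30, 2.12),
    ("MGSM", "Llama3 70B/1B"): (4.00, 3.02),
    ("MGSM", "Llama2 70B/7B"): (3.17, 2.37),
    ("MGSM", "Llama2 70B/7B Temp=1.0"): (2.74, 2.17),
    ("MT-Bench", "Llama3 70B/1B"): (3.08, 2.50),
    ("MT-Bench", "Llama2 70B/7B"): (2.76, 2.16),
    ("MT-Bench", "Llama2 70B/7B Temp=1.0"): (2.68, 2.02),
}


def relative_gain(card: float, pearl: float) -> float:
    """Return CARD's speedup divided by PEARL's speedup."""
    return card / pearl


# One ratio per (benchmark, model setup) row.
GAINS = {key: relative_gain(*pair) for key, pair in RESULTS.items()}

if __name__ == "__main__":
    # Print rows from the smallest to the largest relative gain.
    for (bench, setup), gain in sorted(GAINS.items(), key=lambda kv: kv[1]):
        print(f"{bench:9s} {setup:24s} {gain:.2f}x")
```

On these values the ratio runs from about 1.08x (GSM8k with Llama2 70B/7B Temp=1.0) to about 1.61x (GSM8k with Llama3 70B/1B), so the reported margin is narrowest under sampling at temperature 1.0 on GSM8k.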
Newer alternatives
Recent methods in the same sub-problem, not yet superseded in the knowledge base.
- Collaborative Speculative Decoding (CoSpec) · Beyond the Target: From Imitation to Collaboration in Speculative Decoding · May 24, 2026
- ToolSpec · ToolSpec: Accelerating Tool Calling via Schema-Aware and Retrieval-Augmented Speculative Decoding · Apr 15, 2026
- Quasar · Quasar: Quantized Self-Speculative Acceleration for Rapid Inference via Memory-Efficient Verification · Mar 2, 2026
- FLy (Training-Free Loosely Speculative Decoding) · Training-Free Loosely Speculative Decoding: Accepting Semantically Correct Drafts Beyond Exact Match · Nov 28, 2025
- Nov 1, 2025
- Oct 30, 2025
- Oct 22, 2025
- Oct 8, 2025
- Group Tree Optimization (GTO) · Bridging Draft Policy Misalignment: Group Tree Optimization for Speculative Decoding · Sep 26, 2025
- Sep 22, 2025