EAGLE-2
EAGLE-2: Faster Inference of Language Models with Dynamic Draft Trees
Speculative decoding · first seen Jun 24, 2024
heavily superseded — a standard baseline that newer methods routinely beat
10 papers critique it · 21 beat it on benchmarks
What papers say
Verbatim critique sentences, each from a paper that cites EAGLE-2 as a baseline.
“to achieve the benefits claimed by EAGLE-2, the target model must verify nearly twice as many tokens, which is not a fair comparison. When aligning for the size of the token tree, the benefit of EAGLE-2 in terms of accept length is only 11%. Moreover, due to the additional latency introduced by the dynamic strategy, the wall-clock time of EAGLE-2 is more.”
Source: C2T: A Classifier-Based Tree Construction Method in Speculative Decoding
“When directly applying speculative decoding algorithms on heterogeneous architectures, the acceleration effect is only improved by 1.57 times”
Source: Dovetail: A CPU/GPU Heterogeneous Speculative Decoding for LLM inference
“it is not dynamic in the depth or width of the beam search. However, a dynamic depth would allow the draft model to be called a variable number of times per target model call, depending the likelihood of each sequence in the current beam being correct”
Source: Dynamic Depth Decoding: Faster Speculative Decoding for LLMs
“In the speculative decoding context, EAGLE-2 adapts draft tree depth based on confidence, but does not consider compression.”
Source: SpecKV: Adaptive Speculative Decoding with Compression-Aware Gamma Selection
“The results of reliable experiments across various integration schemes show that EAGLE-2 provides limited benefit for 4-bit weight quantized models (W4A16 and W4A8), indicating a potential conflict.”
Source: Speculative Decoding Meets Quantization: Compatibility Evaluation and Hierarchical Framework Design
“EAGLE2 suffers from a token misalignment rate of 48% during training, resulting in suboptimal acceptance lengths and limiting its effectiveness”
Source: GRIFFIN: Effective Token Alignment for Faster Speculative Decoding
“EAGLE-2 prunes low-confidence branches via the draft head's per-token scores, growing deeper along well-predicted paths; this adapts the shape to within-drafter quality variation but requires a trained head and lacks optimality guarantees.”
Source: Goose: Anisotropic Speculation Trees for Training-Free Speculative Decoding
“EAGLE-2 [li2024eagle] used static trees with fixed-width branching, which are less effective under varying draft confidence”
Source: Spec-LLaVA: Accelerating Vision-Language Models with Dynamic Tree-Based Speculative Decoding
“Building with EAGLE-2 [li2024eagle2], HASS achieves 8%-16% acceptance length improvement over it”
Source: Learning Harmonized Representations for Speculative Sampling
“Many SD pipelines require training a drafting model for multiple epochs over large datasets, or intensive model distillation from full teacher distributions over data [ankner2024hydra, cai2024medusa, li2024eagle, li2024eagle-2, zhang2023draft].”
Source: Draft, Verify, and Improve: Toward Training-Aware Speculative Decoding
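Several of the critiques above turn on how EAGLE-2 shapes its draft tree: it prunes low-confidence branches using the draft head's per-token scores and grows deeper along well-predicted paths. The toy sketch below illustrates that general idea only and is not the EAGLE-2 implementation. The names `build_draft_tree`, `toy_draft`, `top_k`, and `total_tokens` are hypothetical, and the path score is assumed to be the product of per-token draft probabilities.

```python
import heapq
from typing import Callable, Dict, List, Tuple

Path = Tuple[str, ...]
Node = Tuple[Path, float]  # (token path from the root, cumulative confidence)


def build_draft_tree(
    draft_probs: Callable[[Path], Dict[str, float]],
    depth: int,
    top_k: int,
    total_tokens: int,
) -> List[Node]:
    """Toy confidence-guided draft tree (illustrative assumption, not EAGLE-2's code).

    At each depth, every frontier node is expanded with the draft model's
    next-token probabilities; only the top_k most confident paths survive to
    the next depth. Finally the total_tokens most confident nodes are kept
    as the candidates that the target model would verify.
    """
    frontier: List[Node] = [((), 1.0)]
    all_nodes: List[Node] = []
    for _ in range(depth):
        candidates: List[Node] = []
        for path, score in frontier:
            for token, p in draft_probs(path).items():
                # Path confidence = product of per-token draft probabilities.
                candidates.append((path + (token,), score * p))
        # Prune: low-confidence branches are dropped and never expanded further.
        frontier = heapq.nlargest(top_k, candidates, key=lambda n: n[1])
        all_nodes.extend(frontier)
    # Rerank over all kept nodes. A child never scores above its parent,
    # and parents are listed first, so ties still favor the parent.
    return heapq.nlargest(total_tokens, all_nodes, key=lambda n: n[1])


def toy_draft(path: Path) -> Dict[str, float]:
    """Hypothetical stand-in for a draft head: fixed next-token probabilities."""
    if not path:
        return {"a": 0.9, "b": 0.1}
    if path[-1] == "a":
        return {"c": 0.8, "d": 0.2}
    return {"e": 0.5, "f": 0.5}


if __name__ == "__main__":
    for path, score in build_draft_tree(toy_draft, depth=2, top_k=2, total_tokens=3):
        print(path, round(score, 3))
```

In this toy run the low-confidence branch under `b` is never selected, while the well-predicted branch under `a` is extended. The C2T critique above concerns the cost side of this design: a larger `total_tokens` budget means more tokens for the target model to verify.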
Beaten on benchmarks
Head-to-head results where a newer method reports beating EAGLE-2. Values are copied from the source paper's tables — verify against the cited paper.
- C2T: A Classifier-Based Tree Construction Method in Speculative Decoding
C2T β=0.85 beats EAGLE-2 · γ (candidate tokens) [E-2 TopN=26]
226569 vs 317668
- C2T: A Classifier-Based Tree Construction Method in Speculative Decoding
C2T β=0.65 beats EAGLE-2 · γ (candidate tokens) [E-2 TopN=60]
531947 vs 628980
- SuffixDecoding: Extreme Speculative Decoding for Emerging AI Applications
suffix (tree) beats EAGLE-2 · mean accepted tokens per step [AgenticSQL]
6.236 vs 3.572
- SuffixDecoding: Extreme Speculative Decoding for Emerging AI Applications
suffix (tree) beats EAGLE-2 · speedup over vanilla [AgenticSQL]
5.175 vs 1.864
- SuffixDecoding: Extreme Speculative Decoding for Emerging AI Applications
hybrid (tree) beats EAGLE-2 · speedup over vanilla [Spec-Bench (non-agentic)]
4.684 vs 3.466
- Speculative Decoding Meets Quantization: Compatibility Evaluation and Hierarchical Framework Design
HierSpec beats EAGLE-2 · Tok/s [W4A16 Llama-3-70B, d=6]
98.19 vs 63.44
- Speculative Decoding Meets Quantization: Compatibility Evaluation and Hierarchical Framework Design
HierSpec beats EAGLE-2 · Tok/s [W4A16 Llama-3-70B, d=7]
98.00 vs 63.44
- GRIFFIN: Effective Token Alignment for Faster Speculative Decoding
GRIFFIN beats EAGLE-2 · SR [LLaMA2-7B, Temperature=0, MT-Bench]
3.12 vs 2.69
- GRIFFIN: Effective Token Alignment for Faster Speculative Decoding
GRIFFIN beats EAGLE-2 · tau [LLaMA2-7B, Temperature=0, MT-Bench]
5.11 vs 4.50
- GRIFFIN: Effective Token Alignment for Faster Speculative Decoding
GRIFFIN beats EAGLE-2 · SR [LLaMA2-7B, Temperature=0, Average]
3.28 vs 2.89
- GRIFFIN: Effective Token Alignment for Faster Speculative Decoding
GRIFFIN beats EAGLE-2 · tau [LLaMA2-7B, Temperature=0, Average]
5.44 vs 4.82
- GRIFFIN: Effective Token Alignment for Faster Speculative Decoding
GRIFFIN beats EAGLE-2 · SR [LLaMA3-8B, Temperature=0, MT-Bench]
3.09 vs 2.56
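The raw pairs above are easier to compare as relative changes. The sketch below recomputes them for a few of the listed rows; the values are copied from the list, while the helper `relative_change` and the `higher_is_better` flag are illustrative assumptions. The C2T rows are treated as lower-is-better because they count candidate tokens and the smaller value is reported as the winner.

```python
from typing import List, Tuple

# (label, newer method's value, EAGLE-2's value, higher_is_better)
ROWS: List[Tuple[str, float, float, bool]] = [
    ("C2T beta=0.85, candidate tokens [E-2 TopN=26]", 226569, 317668, False),
    ("SuffixDecoding suffix (tree), speedup [AgenticSQL]", 5.175, 1.864, True),
    ("HierSpec, Tok/s [W4A16 Llama-3-70B, d=6]", 98.19, 63.44, True),
    ("GRIFFIN, tau [LLaMA2-7B, T=0, MT-Bench]", 5.11, 4.50, True),
]


def relative_change(new: float, baseline: float) -> float:
    """Percentage change of `new` relative to `baseline`."""
    return (new - baseline) / baseline * 100.0


if __name__ == "__main__":
    for label, new, base, higher_is_better in ROWS:
        change = relative_change(new, base)
        direction = "higher" if higher_is_better else "lower"
        print(f"{label}: {change:+.1f}% vs EAGLE-2 ({direction} is better)")
```

Relative changes are only comparable within a row's own setting: the rows use different models, benchmarks, and metrics, so the percentages should not be ranked against each other.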
Newer alternatives
Recent methods in the same sub-problem, not yet superseded in the knowledge base.
- DREAM-S · DREAM-S: Speculative Decoding with Searchable Drafting and Target-Aware Refinement for Multimodal Generation · May 30, 2026
- May 14, 2026
- SpecForge · SpecForge: A Flexible and Efficient Open-Source Training Framework for Speculative Decoding · Mar 19, 2026
- Mar 13, 2026
- Feb 17, 2026
- Oct 22, 2025
- Oct 22, 2025
- Oct 17, 2025
- Draft, Verify, & Improve (DVI) · Draft, Verify, and Improve: Toward Training-Aware Speculative Decoding · Oct 6, 2025
- FastGRPO · FastGRPO: Accelerating Policy Optimization via Concurrency-aware Speculative Decoding and Online Draft Learning · Sep 26, 2025