Is EAGLE-2 superseded?

EAGLE-2 (speculative decoding) is heavily superseded: it is a standard baseline that newer methods routinely beat. 10 papers critique it and 21 beat it on benchmarks, which ranks it #2 of 151 most-superseded. Sub-problem: cluster led by EAGLE-2. Newer alternatives in the same sub-problem include DREAM-S, PPOW, SpecForge, OnlineSpec, MoE-Spec.


Heavily superseded · #2 of 151 most-superseded

EAGLE-2

EAGLE-2: Faster Inference of Language Models with Dynamic Draft Trees

Speculative decoding · first seen Jun 24, 2024

heavily superseded — a standard baseline that newer methods routinely beat

10 papers critique it · 21 beat it on benchmarks

What papers say

Verbatim critique sentences, each from a paper that cites EAGLE-2 as a baseline.

“to achieve the benefits claimed by EAGLE-2, the target model must verify nearly twice as many tokens, which is not a fair comparison. When aligning for the size of the token tree, the benefit of EAGLE-2 in terms of accept length is only 11%. Moreover, due to the additional latency introduced by the dynamic strategy, the wall-clock time of EAGLE-2 is more.”
— C2T: A Classifier-Based Tree Construction Method in Speculative Decoding
“When directly applying speculative decoding algorithms on heterogeneous architectures, the acceleration effect is only improved by 1.57 times”
— Dovetail: A CPU/GPU Heterogeneous Speculative Decoding for LLM inference
“it is not dynamic in the depth or width of the beam search. However, a dynamic depth would allow the draft model to be called a variable number of times per target model call, depending the likelihood of each sequence in the current beam being correct”
— Dynamic Depth Decoding: Faster Speculative Decoding for LLMs
“In the speculative decoding context, EAGLE-2 adapts draft tree depth based on confidence, but does not consider compression.”
— SpecKV: Adaptive Speculative Decoding with Compression-Aware Gamma Selection
“The results of reliable experiments across various integration schemes show that EAGLE-2 provides limited benefit for 4-bit weight quantized models (W4A16 and W4A8), indicating a potential conflict.”
— Speculative Decoding Meets Quantization: Compatibility Evaluation and Hierarchical Framework Design
“EAGLE2 suffers from a token misalignment rate of 48% during training, resulting in suboptimal acceptance lengths and limiting its effectiveness”
— GRIFFIN: Effective Token Alignment for Faster Speculative Decoding
“EAGLE-2 prunes low-confidence branches via the draft head's per-token scores, growing deeper along well-predicted paths; this adapts the shape to within-drafter quality variation but requires a trained head and lacks optimality guarantees.”
— Goose: Anisotropic Speculation Trees for Training-Free Speculative Decoding
“EAGLE-2 [li2024eagle] used static trees with fixed-width branching, which are less effective under varying draft confidence”
— Spec-LLaVA: Accelerating Vision-Language Models with Dynamic Tree-Based Speculative Decoding
“Building with EAGLE-2 [li2024eagle2], HASS achieves 8%-16% acceptance length improvement over it”
— Learning Harmonized Representations for Speculative Sampling
“Many SD pipelines require training a drafting model for multiple epochs over large datasets, or intensive model distillation from full teacher distributions over data [ankner2024hydra, cai2024medusa, li2024eagle, li2024eagle-2, zhang2023draft].”
— Draft, Verify, and Improve: Toward Training-Aware Speculative Decoding
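
Several of the critiques above turn on how a confidence-driven draft tree is grown and pruned. The sketch below is a toy illustration of that general idea, not EAGLE-2's actual implementation: at each level only the highest-confidence nodes are expanded, and a fixed budget of candidate tokens is kept at the end. The names `grow_draft_tree`, `toy_drafter`, `expand_k`, and `total_tokens` are hypothetical, and the drafter is a hard-coded stand-in for a trained draft head.

```python
import heapq
import math
from typing import Callable, Dict, List, Tuple

Path = Tuple[int, ...]  # a draft sequence, written as a tuple of token ids


def grow_draft_tree(
    draft_probs: Callable[[Path], Dict[int, float]],
    depth: int,
    expand_k: int,
    total_tokens: int,
) -> List[Tuple[Path, float]]:
    """Grow a draft tree, expanding only the most confident nodes per level.

    Returns up to `total_tokens` (path, cumulative log-confidence) pairs,
    sorted from most to least confident.
    """
    frontier: List[Tuple[Path, float]] = [((), 0.0)]
    candidates: List[Tuple[Path, float]] = []
    for _ in range(depth):
        children: List[Tuple[Path, float]] = []
        for path, logp in frontier:
            for token, p in draft_probs(path).items():
                if p > 0.0:
                    children.append((path + (token,), logp + math.log(p)))
        if not children:
            break
        candidates.extend(children)
        # Only the expand_k most confident children are expanded further.
        frontier = heapq.nlargest(expand_k, children, key=lambda c: c[1])
    # Keep a fixed budget of candidates to hand to the target model.
    return heapq.nlargest(total_tokens, candidates, key=lambda c: c[1])


def toy_drafter(path: Path) -> Dict[int, float]:
    # Hypothetical draft head: confident right after token 1, unsure otherwise.
    if path and path[-1] == 1:
        return {1: 0.9, 2: 0.1}
    return {1: 0.6, 2: 0.3, 3: 0.1}


tree = grow_draft_tree(toy_drafter, depth=3, expand_k=2, total_tokens=5)
for path, logp in tree:
    print(path, round(math.exp(logp), 3))
```

Because a child's cumulative confidence never exceeds its parent's, keeping the top-scoring candidates tends to preserve whole root-to-node paths, and the tree grows deeper where the drafter is confident. The fairness point raised by C2T corresponds to the `total_tokens` budget here: two methods are only comparable when the target model verifies a similar number of candidates.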

Beaten on benchmarks

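Head-to-head results where a newer method reports beating EAGLE-2. Each pair reads as newer method vs EAGLE-2; for γ (candidate tokens) the lower count is the better one, while for the other metrics listed here the higher value wins. Values are copied from the source paper's tables, so verify them against the cited paper.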

C2T β=0.85 beats EAGLE-2 · γ (candidate tokens) [E-2 TopN=26]
226569 vs 317668
C2T: A Classifier-Based Tree Construction Method in Speculative Decoding
C2T β=0.65 beats EAGLE-2 · γ (candidate tokens) [E-2 TopN=60]
531947 vs 628980
C2T: A Classifier-Based Tree Construction Method in Speculative Decoding
suffix (tree) beats EAGLE-2 · mean accepted tokens per step [AgenticSQL]
6.236 vs 3.572
SuffixDecoding: Extreme Speculative Decoding for Emerging AI Applications
suffix (tree) beats EAGLE-2 · speedup over vanilla [AgenticSQL]
5.175 vs 1.864
SuffixDecoding: Extreme Speculative Decoding for Emerging AI Applications
hybrid (tree) beats EAGLE-2 · speedup over vanilla [Spec-Bench (non-agentic)]
4.684 vs 3.466
SuffixDecoding: Extreme Speculative Decoding for Emerging AI Applications
HierSpec beats EAGLE-2 · Tok/s [W4A16 Llama-3-70B, d=6]
98.19 vs 63.44
Speculative Decoding Meets Quantization: Compatibility Evaluation and Hierarchical Framework Design
HierSpec beats EAGLE-2 · Tok/s [W4A16 Llama-3-70B, d=7]
98.00 vs 63.44
Speculative Decoding Meets Quantization: Compatibility Evaluation and Hierarchical Framework Design
GRIFFIN beats EAGLE-2 · SR [LLaMA2-7B, Temperature=0, MT-Bench]
3.12 vs 2.69
GRIFFIN: Effective Token Alignment for Faster Speculative Decoding
GRIFFIN beats EAGLE-2 · tau [LLaMA2-7B, Temperature=0, MT-Bench]
5.11 vs 4.50
GRIFFIN: Effective Token Alignment for Faster Speculative Decoding
GRIFFIN beats EAGLE-2 · SR [LLaMA2-7B, Temperature=0, Average]
3.28 vs 2.89
GRIFFIN: Effective Token Alignment for Faster Speculative Decoding
GRIFFIN beats EAGLE-2 · tau [LLaMA2-7B, Temperature=0, Average]
5.44 vs 4.82
GRIFFIN: Effective Token Alignment for Faster Speculative Decoding
GRIFFIN beats EAGLE-2 · SR [LLaMA3-8B, Temperature=0, MT-Bench]
3.09 vs 2.56
GRIFFIN: Effective Token Alignment for Faster Speculative Decoding
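
To make the size of these gaps easier to compare, the relative gain over EAGLE-2 can be computed from each pair. The snippet below is a small sketch using a subset of the values listed above; the `results` layout and the `relative_gain` helper are illustrative names, and the C2T rows are left out because their metric is a cost where lower is better.

```python
# (method, metric, setting, newer method's value, EAGLE-2's value),
# copied from the entries listed above.
results = [
    ("suffix (tree)", "mean accepted tokens per step", "AgenticSQL", 6.236, 3.572),
    ("suffix (tree)", "speedup over vanilla", "AgenticSQL", 5.175, 1.864),
    ("hybrid (tree)", "speedup over vanilla", "Spec-Bench (non-agentic)", 4.684, 3.466),
    ("HierSpec", "Tok/s", "W4A16 Llama-3-70B, d=6", 98.19, 63.44),
    ("GRIFFIN", "SR", "LLaMA2-7B, Temperature=0, MT-Bench", 3.12, 2.69),
    ("GRIFFIN", "tau", "LLaMA2-7B, Temperature=0, MT-Bench", 5.11, 4.50),
]


def relative_gain(new: float, baseline: float) -> float:
    """Percentage improvement of `new` over `baseline` (higher-is-better metrics)."""
    return (new - baseline) / baseline * 100.0


for method, metric, setting, new, baseline in results:
    gain = relative_gain(new, baseline)
    print(f"{method:14s} {metric:30s} [{setting}] +{gain:.1f}%")
```

The percentages are not comparable across rows with different models, datasets, or metrics; they only summarize each head-to-head pair as reported.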

Newer alternatives

Recent methods in the same sub-problem, not yet superseded in the knowledge base, including DREAM-S, PPOW, SpecForge, OnlineSpec, and MoE-Spec.