Method Drift

Living systematic review

Speculative decoding

Speeding up autoregressive LLM generation by drafting tokens cheaply and verifying them in parallel.
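The draft-then-verify loop can be sketched as follows. This is a minimal illustration of the greedy variant only, not any listed paper's implementation. The "models" are hypothetical toy functions that map a token prefix to a next token, and verification is written as a loop where a real system would score all drafted positions in one batched forward pass of the target model.

```python
from typing import Callable, List

# Assumption: a "model" is any function from a token prefix to its greedy next token.
NextToken = Callable[[List[int]], int]


def speculative_step(prefix: List[int], draft: NextToken, target: NextToken, k: int = 4) -> List[int]:
    """Draft k tokens cheaply, then keep the longest prefix the target agrees with."""
    # 1. Draft k tokens autoregressively with the cheap model.
    drafted: List[int] = []
    for _ in range(k):
        drafted.append(draft(prefix + drafted))

    # 2. Verify each drafted token against the target's own greedy choice.
    #    (Looped here for clarity; real systems do this in one parallel pass.)
    accepted: List[int] = []
    for tok in drafted:
        expected = target(prefix + accepted)
        if tok != expected:
            accepted.append(expected)  # first mismatch: take the target's token and stop
            return accepted
        accepted.append(tok)

    # 3. Every draft was accepted, so the target's next token comes as a bonus.
    accepted.append(target(prefix + accepted))
    return accepted


def speculative_generate(prefix: List[int], draft: NextToken, target: NextToken,
                         n_tokens: int, k: int = 4) -> List[int]:
    out = list(prefix)
    while len(out) - len(prefix) < n_tokens:
        out.extend(speculative_step(out, draft, target, k))
    return out[: len(prefix) + n_tokens]


def greedy_generate(prefix: List[int], model: NextToken, n_tokens: int) -> List[int]:
    """Plain one-token-at-a-time decoding, used as the reference."""
    out = list(prefix)
    for _ in range(n_tokens):
        out.append(model(out))
    return out


# Hypothetical toy models: the draft disagrees with the target at every 4th position.
def toy_target(prefix: List[int]) -> int:
    return (prefix[-1] * 3 + 1) % 11


def toy_draft(prefix: List[int]) -> int:
    guess = toy_target(prefix)
    return (guess + 1) % 11 if len(prefix) % 4 == 0 else guess
```

In this greedy setting the speculative output matches plain greedy decoding with the target model token for token; the speedup comes only from how many drafted tokens are accepted per verification step. With sampling instead of greedy decoding, the exact-match check is replaced by a rejection-sampling acceptance rule, which this sketch does not cover.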

182 papers · 333 critique receipts · 1,849 benchmark results · updated Jun 18, 2026

Most-superseded baselines

Ranked by how many distinct papers critique or beat each method. These are the standard baselines newer work routinely measures against.

  1. 2
    EAGLE-2 · EAGLE-2
    EAGLE-2: Faster Inference of Language Models with Dynamic Draft Trees

    10 papers critique it · 21 beat it on benchmarks

  2. 3
    EAGLE · EAGLE-2
    EAGLE: Speculative Sampling Requires Rethinking Feature Uncertainty

    19 papers critique it · 12 beat it on benchmarks

  3. 4
    Medusa · EAGLE-2
    Medusa: Simple LLM Inference Acceleration Framework with Multiple Decoding Heads

    19 papers critique it · 9 beat it on benchmarks

  4. 6
    PLD · Lookahead
    PLD+: Accelerating LLM inference by leveraging Language Model Artifacts

    7 papers critique it · 13 beat it on benchmarks

  5. 7
    REST · Lookahead
    REST: Retrieval-Based Speculative Decoding

    6 papers critique it · 9 beat it on benchmarks

  6. 9
    Speculative Sampling for Parametric Temporal Point Processes

    2 papers critique it · 10 beat it on benchmarks

  7. 10
    DFlash · EAGLE-3
    DFlash: Block Diffusion for Flash Speculative Decoding

    4 papers critique it · 6 beat it on benchmarks

  8. 11
    LayerSkip · SpecInfer
    LayerSkip: Enabling Early Exit Inference and Self-Speculative Decoding

    7 papers critique it · 3 beat it on benchmarks
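One plausible reading of the ranking rule above, sketched under assumptions: each method is scored by the number of distinct papers that either critique it or beat it on a benchmark, so a paper that does both counts once. The function name, the input shape, and the alphabetical tie-break are hypothetical; the page does not state how ties or overlaps are actually handled.

```python
from typing import Dict, List, Set, Tuple


def rank_by_supersession(
    critiqued_by: Dict[str, Set[str]],
    beaten_by: Dict[str, Set[str]],
) -> List[Tuple[str, int]]:
    """Rank methods by how many distinct papers critique or beat them."""
    methods = set(critiqued_by) | set(beaten_by)
    scores = {
        # Set union, so a paper that both critiques and beats a method counts once.
        m: len(critiqued_by.get(m, set()) | beaten_by.get(m, set()))
        for m in methods
    }
    # Highest score first; ties broken alphabetically (an assumption).
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


# Hypothetical method and paper identifiers, for illustration only.
example_critiques = {"method_a": {"p1", "p2"}, "method_b": {"p3"}}
example_beats = {"method_a": {"p2", "p3"}, "method_b": {"p1", "p2", "p3", "p4"}}
ranking = rank_by_supersession(example_critiques, example_beats)
```

Under this reading, the two counts shown per entry cannot simply be added to recover the score, because the same paper may appear in both.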

Sub-problems

Methods that compete on the same benchmarks cluster into distinct sub-problems.

The frontier

Recent methods not yet superseded in the knowledge base.