Superseded baseline · #15 of 151 most-superseded

SWIFT

Speculative decoding

superseded — cited as a baseline and beaten by newer methods

4 papers critique it · 2 beat it on benchmarks

What papers say

Verbatim critique sentences, each from a paper that cites SWIFT as a baseline.

  • While SWIFT only allows a fixed skipping rate, Conflayers does not limit the search space to a certain number of layers to skip and expands the exploration set to any number of layers below or above a pre-defined threshold while conditioning the search on the performance of the draft model.
    ConfLayers: Adaptive Confidence-based Layer Skipping for Self-Speculative Decoding
  • requires substantial workload-specific tuning to isolate layers and operations that can be skipped while maintaining high token acceptance rates
    HiSpec: Hierarchical Speculative Decoding for LLMs
  • The most closely related framework to ours is SWIFT (Xia et al., 2025), which adaptively selects subsets of layers to skip during inference under a speculative decoding paradigm. By treating the same LLM as both draft and verifier via dynamic layer selection, SWIFT achieves lossless acceleration without introducing new modules or supervision. However, SWIFT still requires iterative Bayesian optimization to identify the optimal layer subsets, which can be computationally expensive prior to deployment.
    SDFP: Speculative Decoding with FIT-Pruned Models for Training-Free and Plug-and-Play LLM Acceleration
  • All existing self-speculative methods share a common assumption: the model is a homogeneous stack of similar layers, and the drafting strategy consists of skipping or shortcutting some of these layers. This assumption breaks down in hybrid architectures, where layers contain fundamentally different computational components.
    Component-Aware Self-Speculative Decoding in Hybrid Language Models
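The mechanism these critiques refer to, using one model as both draft and verifier by skipping layers while drafting, can be illustrated with a toy sketch. This is not SWIFT's implementation: the integer "layers", the helper names (`make_layers`, `next_token`, `self_speculative_decode`), and the hand-picked `skip` set are all assumptions made for illustration, and the layer-selection search mentioned in the critiques is not modelled. Verification is also done token by token here for clarity, whereas a real system would typically verify a draft in one parallel forward pass.

```python
from typing import Callable, Collection, List, Sequence, Tuple

Layer = Callable[[int], int]


def make_layers(n: int) -> List[Layer]:
    # Hypothetical toy "layers": each one only changes the hidden state
    # on some inputs, so skipping it sometimes leaves the output unchanged.
    return [lambda h, k=k: (h + k) if h % (k + 2) == 0 else h for k in range(n)]


def next_token(layers: Sequence[Layer], context: Sequence[int],
               skip: Collection[int] = (), vocab: int = 50) -> int:
    # Greedy next token from a toy model; layers whose index is in `skip`
    # are bypassed (this is the "draft" configuration).
    h = sum((i + 1) * t for i, t in enumerate(context)) % 1009
    for idx, layer in enumerate(layers):
        if idx not in skip:
            h = layer(h)
    return h % vocab


def greedy_decode(layers: Sequence[Layer], prompt: Sequence[int],
                  n_new: int) -> List[int]:
    # Reference output: plain greedy decoding with the full model.
    seq = list(prompt)
    for _ in range(n_new):
        seq.append(next_token(layers, seq))
    return seq


def self_speculative_decode(layers: Sequence[Layer], prompt: Sequence[int],
                            n_new: int, skip: Collection[int],
                            draft_len: int = 4) -> Tuple[List[int], float]:
    seq = list(prompt)
    target_len = len(prompt) + n_new
    accepted = drafted = 0
    while len(seq) < target_len:
        # Draft phase: the same model, with the layers in `skip` bypassed.
        k = min(draft_len, target_len - len(seq))
        draft: List[int] = []
        for _ in range(k):
            draft.append(next_token(layers, seq + draft, skip))
        drafted += k
        # Verify phase: the full model checks each drafted position.
        for i in range(k):
            full = next_token(layers, seq + draft[:i])
            if full != draft[i]:
                # Keep the matching prefix, then the full model's own token.
                seq.extend(draft[:i])
                seq.append(full)
                accepted += i
                break
        else:
            seq.extend(draft)
            accepted += k
    rate = accepted / drafted if drafted else 0.0
    return seq, rate


if __name__ == "__main__":
    layers = make_layers(8)
    prompt = [3, 1, 4]
    out, rate = self_speculative_decode(layers, prompt, 20, skip={2, 5})
    # Every emitted token is either verified or produced by the full model,
    # so the result matches plain greedy decoding.
    assert out == greedy_decode(layers, prompt, 20)
    print(f"draft acceptance rate: {rate:.2f}")
```

In this sketch the choice of `skip` only affects the acceptance rate, never the output, which is why the critiques above focus on how the layers to skip are chosen and how costly that choice is.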

Beaten on benchmarks

Head-to-head results where a newer method reports beating SWIFT. Values are copied from the source paper's tables — verify against the cited paper.

Newer alternatives

Recent methods in the same sub-problem, not yet superseded in the knowledge base.