Superseded baseline · #101 of 151 most-superseded
Dynamic Depth Decoding
Dynamic Depth Decoding: Faster Speculative Decoding for LLMs · Speculative decoding · first seen Aug 30, 2024
superseded — cited as a baseline and beaten by newer methods
1 paper critiques it · 0 beat it on benchmarks
What papers say
Verbatim critique sentences, each from a paper that cites Dynamic Depth Decoding as a baseline.
“However, unlike , none of these techniques focus on the data movement cost due to speculation. They require access to output probability distributions and are incompatible with approaches like n-gram speculation. Also, they rely on aggressive drafting, assuming very low over-speculation penalties (1–2% per unit increase in K), and must draft/verify at least one token to estimate benefits. Moreover, such schemes introduce CPU to GPU communication between drafter iterations on the GPU, to apply policy heuristics. Consequently, stopping criteria are used infrequently--e.g., DDD defers until the $5^{th}$ drafter iteration—making these methods too costly for MoEs.”
— Utility-Driven Speculative Decoding for Mixture-of-Experts
Newer alternatives
Recent methods in the same sub-problem, not yet superseded in the knowledge base.
- Nov 3, 2025