Superseded baseline · #71 of 151 most-superseded
SLED
SLED: A Speculative LLM Decoding Framework for Efficient Edge Serving
Speculative decoding · first seen Jun 11, 2025
superseded — cited as a baseline and beaten by newer methods
2 papers critique it · 0 beat it on benchmarks
What papers say
Verbatim critique sentences, each from a paper that cites SLED as a baseline.
“existing edge SD systems typically adopt simple batching policies, e.g., static batching in SLED”
— DiP-SD: Distributed Pipelined Speculative Decoding for Efficient LLM Inference at the Edge
“Similarly, SLED li2025sled focuses on multi-client throughput but fails to provide deep latency masking for individual users.”
— A Pipelined Collaborative Speculative Decoding Framework for Efficient Edge-Cloud LLM Inference
Newer alternatives
Recent methods in the same sub-problem, not yet superseded in the knowledge base.