Superseded baseline · #124 of 234 most-superseded
Block diffusion
Block Diffusion: Interpolating Between Autoregressive and Diffusion Language Models
KV-cache compression · first seen Mar 12, 2025
superseded — cited as a baseline and beaten by newer methods
1 paper critiques it · 0 beat it on benchmarks
What papers say
Verbatim critique sentences, each from a paper that cites Block diffusion as a baseline.
“This requires considering the KV-Cache in training, making twice the forward computation in the training and its form is still constrained in the autoregressive formula.”
— dKV-Cache: The Cache for Diffusion Language Models
Newer alternatives
Recent methods in the same sub-problem, not yet superseded in the knowledge base.