Superseded baseline · #209 of 234 most-superseded
simple top-K KV cache pruning
KV-cache compression
superseded — cited as a baseline and beaten by newer methods
1 paper critiques it · 0 beat it on benchmarks
What papers say
Verbatim critique sentences, each from a paper that cites simple top-K KV cache pruning as a baseline.
“this strategy struggles to determine the optimal time for pruning, often leading to excessive pruning. This, in turn, results in the loss of crucial visual information, negatively impacting the overall performance of the model.”
— PruneHal: Reducing Hallucinations in Multi-modal Large Language Models through Adaptive KV Cache Pruning
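For context, this page does not define the baseline, so the following is only a minimal sketch of one common reading of "simple top-K KV cache pruning": keep the K cached positions with the highest importance score and evict the rest. The function name `topk_prune_kv`, the per-position `scores` input (for example, accumulated attention weights), and the tie-breaking rule are all assumptions made for illustration, not details taken from the cited paper.

```python
from typing import List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def topk_prune_kv(
    keys: Sequence[T],
    values: Sequence[T],
    scores: Sequence[float],
    k: int,
) -> Tuple[List[T], List[T], List[int]]:
    """Keep the k cache positions with the highest score (hypothetical helper).

    `scores` is an assumed per-position importance value, e.g. attention
    mass accumulated over recent decoding steps.
    """
    if not (len(keys) == len(values) == len(scores)):
        raise ValueError("keys, values and scores must have the same length")
    if k < 0:
        raise ValueError("k must be non-negative")
    k = min(k, len(scores))

    # Rank positions by descending score; ties go to the earlier position.
    ranked = sorted(range(len(scores)), key=lambda i: (-scores[i], i))

    # Restore the original order so the surviving entries stay in sequence.
    kept = sorted(ranked[:k])

    return [keys[i] for i in kept], [values[i] for i in kept], kept


if __name__ == "__main__":
    keys = ["k0", "k1", "k2", "k3"]
    values = ["v0", "v1", "v2", "v3"]
    scores = [0.10, 0.50, 0.05, 0.35]
    print(topk_prune_kv(keys, values, scores, k=2))
```

In this sketch both `k` and the moment of pruning are fixed by the caller. Any position whose score happens to be low at that moment is dropped for good, which is the kind of mistimed, excessive pruning the critique above describes.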
Newer alternatives
Recent methods in the same sub-problem, not yet superseded in the knowledge base.