Superseded baseline · #209 of 234 most-superseded

simple top-K KV cache pruning

KV-cache compression

superseded — cited as a baseline and beaten by newer methods

1 paper critiques it · 0 beat it on benchmarks

What papers say

Verbatim critique sentences, each from a paper that cites simple top-K KV cache pruning as a baseline.

“this strategy struggles to determine the optimal time for pruning, often leading to excessive pruning. This, in turn, results in the loss of crucial visual information, negatively impacting the overall performance of the model.”
— PruneHal: Reducing Hallucinations in Multi-modal Large Language Models through Adaptive KV Cache Pruning
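
For readers unfamiliar with the baseline, here is a minimal sketch of what one step of simple top-K KV cache pruning might look like. It is not taken from the cited paper. The function name `top_k_prune`, the per-token `scores` input (for example, attention mass accumulated over recent queries), and the fixed budget `k` are all assumptions made for illustration. The sketch has no logic for deciding when to prune, which is the aspect the critique above targets.

```python
from typing import List, Sequence, Tuple


def top_k_prune(
    keys: Sequence[Sequence[float]],
    values: Sequence[Sequence[float]],
    scores: Sequence[float],
    k: int,
) -> Tuple[List[Sequence[float]], List[Sequence[float]], List[int]]:
    """Keep the k cache entries with the highest importance scores.

    `scores` is a hypothetical per-token importance value; how it is
    computed is outside the scope of this sketch. The surviving entries
    are returned in their original token order.
    """
    if not (len(keys) == len(values) == len(scores)):
        raise ValueError("keys, values and scores must have the same length")
    if k < 0:
        raise ValueError("k must be non-negative")

    if k >= len(scores):
        # Budget covers the whole cache: nothing to prune.
        kept = list(range(len(scores)))
    else:
        # Rank positions by descending score; ties go to the earlier token.
        ranked = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
        # Restore original token order among the survivors.
        kept = sorted(ranked[:k])

    return [keys[i] for i in kept], [values[i] for i in kept], kept


# Toy cache with four tokens and one-dimensional keys and values.
toy_keys = [[0.1], [0.2], [0.3], [0.4]]
toy_values = [[1.0], [2.0], [3.0], [4.0]]
toy_scores = [0.05, 0.60, 0.10, 0.25]

pruned_keys, pruned_values, kept_positions = top_k_prune(
    toy_keys, toy_values, toy_scores, k=2
)
print(kept_positions)  # positions of the tokens that survive pruning
```

Because the budget is fixed and the step is applied unconditionally, any token outside the top `k` is discarded regardless of how much information it carries. An adaptive method would add a trigger or a variable budget on top of a step like this.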

Newer alternatives

Recent methods in the same sub-problem, not yet superseded in the knowledge base.

PruneHal (paper: PruneHal: Reducing Hallucinations in Multi-modal Large Language Models through Adaptive KV Cache Pruning)
Oct 22, 2025