CaM (KV-cache compression): superseded, meaning it is cited as a baseline and beaten by newer methods. 6 papers critique it and 6 beat it on benchmarks, which ranks it #9 of 234 most-superseded. Sub-problem: cluster led by SnapKV. Newer alternatives in the same sub-problem include STaR-KV, GRKV, MomentKV, NestedKV, IndexMem.

Superseded baseline · #9 of 234 most-superseded

CaM

KV-cache compression

superseded — cited as a baseline and beaten by newer methods

6 papers critique it · 6 beat it on benchmarks

What papers say

Verbatim critique sentences, each from a paper that cites CaM as a baseline.

“these single-modal optimizations exhibit limited efficacy in MLLMs due to cross-modal distribution shifts and attention pattern divergence, failing to preserve modality-specific information fidelity.”
— FlowMM: Cross-Modal Information Flow Guided KV Cache Merging for Efficient Multimodal Context Inference
“In contrast, CaM [zhang2024cam] adaptively merges evicted value states into others but does not merge the corresponding keys.”
— KeepKV: Eliminating Output Perturbation in KV Cache Compression for Efficient LLMs Inference
“most merging methods rely on local heuristics, they often funnel many evicted tokens into a small set of span-boundary tokens. These boundary tokens therefore become the main information carriers, overloading their representations and making them prone to over-merging: excessive aggregation can blur or even erase their original semantics, thereby degrading overall performance.”
— GRKV: Global Regression for Training-Free KV Cache Compression in Long-Context LLMs
“CaM merges the values of evicted tokens only probabilistically, with a non-negligible probability of discarding them and hence losing information.”
— WeightedKV: Attention Scores Weighted Key-Value Cache Merging for Large Language Models
“a fundamental limitation of these approaches is their uniform treatment of keys and values during merging despite their distinct distributional characteristics.”
— Homogeneous Keys, Heterogeneous Values: Exploiting Local KV Cache Asymmetry for Long-Context LLMs
“Notably, both Quest and CaM report results only on LLaMA2, without comparisons to other eviction methods, limiting their relevance to current frontier models.”
— CAOTE: KV Cache Selection for LLMs via Attention Output Error-Based Token Eviction
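
Two of the critiques above (KeepKV and WeightedKV) describe the same behaviour: evicted values are merged into retained ones only with some probability, and keys are never merged. The sketch below illustrates that pattern so the objections are easier to picture. It is not the CaM algorithm itself. The function name, the top-`budget` retention rule, the merge probability (evicted attention divided by mean retained attention, capped at 1), and the even spreading of a merged value across retained slots are all assumptions made for illustration.

```python
import numpy as np


def evict_with_value_merge(keys, values, attn, budget, rng):
    """Hypothetical value-only, probabilistic merge on eviction.

    keys, values: arrays of shape (n_tokens, dim)
    attn:         positive per-token attention mass, shape (n_tokens,)
    budget:       number of tokens to retain
    rng:          numpy random Generator
    """
    n = keys.shape[0]
    # Assumption: retain the `budget` tokens with the largest attention mass.
    keep = np.sort(np.argsort(attn)[-budget:])
    evict = np.setdiff1d(np.arange(n), keep)

    kept_keys = keys[keep].copy()      # keys are left untouched (KeepKV's point)
    kept_values = values[keep].copy()
    mean_kept_attn = attn[keep].mean()

    merged = []
    for i in evict:
        # Assumption: merge probability grows with the evicted token's attention.
        p = min(1.0, attn[i] / mean_kept_attn)
        if rng.random() < p:
            # Spread the evicted value evenly over the retained values.
            kept_values += values[i] / len(keep)
            merged.append(i)
        # Otherwise the value is simply dropped (WeightedKV's point).

    return kept_keys, kept_values, keep, np.array(merged, dtype=int)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    keys = rng.normal(size=(8, 4))
    values = rng.normal(size=(8, 4))
    attn = rng.random(8) + 0.01
    _, _, keep, merged = evict_with_value_merge(keys, values, attn, 3, rng)
    print("kept:", keep, "merged:", merged)
```

In this sketch, any evicted token that fails the random draw contributes nothing to the compressed cache, and the retained keys no longer match the values they are paired with once a merge happens. Those are the two effects the quoted sentences object to.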

Beaten on benchmarks

Head-to-head results where a newer method reports beating CaM. Each pair of values is listed as newer method vs CaM. Values are copied from the source papers' tables; verify against the cited paper.

SemantiCache beats CaM · Average score [Llama-3-8B, 20% cache budget]
30.01 vs 27.86
SemantiCache: Efficient KV Cache Compression via Semantic Chunking and Clustered Merging
SemantiCache beats CaM · Average score [Mistral-7B, 20% cache budget]
39.68 vs 35.27
SemantiCache: Efficient KV Cache Compression via Semantic Chunking and Clustered Merging
SemantiCache beats CaM · Accuracy [Mistral-7B, L=32k, cache budget 1024]
91.02 vs 86.52
SemantiCache: Efficient KV Cache Compression via Semantic Chunking and Clustered Merging
SemantiCache beats CaM · TPOT (s, lower is better) [Llama-3-8B, 32k context]
0.031 vs 0.039
SemantiCache: Efficient KV Cache Compression via Semantic Chunking and Clustered Merging
SemantiCache beats CaM · Memory (GB, lower is better) [Llama-3-8B, 32k context]
15.94 vs 17.03
SemantiCache: Efficient KV Cache Compression via Semantic Chunking and Clustered Merging
Meta-Soft beats CaM · Avg [All context lengths]
75.72 vs 68.20
Meta-Soft: Leveraging Composable Meta-Tokens for Context-Preserving KV Cache Compression
KeepKV beats CaM · NrtvQA [Llama-2-7B]
17.32 vs 11.79
KeepKV: Eliminating Output Perturbation in KV Cache Compression for Efficient LLMs Inference
KeepKV beats CaM · Qasper [Llama-2-7B]
7.48 vs 5.1
KeepKV: Eliminating Output Perturbation in KV Cache Compression for Efficient LLMs Inference
KeepKV beats CaM · MF-en [Llama-2-7B]
22.2 vs 19.12
KeepKV: Eliminating Output Perturbation in KV Cache Compression for Efficient LLMs Inference
KeepKV beats CaM · HotpotQA [Llama-2-7B]
8.51 vs 7.26
KeepKV: Eliminating Output Perturbation in KV Cache Compression for Efficient LLMs Inference
KeepKV beats CaM · Musique [Llama-2-7B]
4.65 vs 3.64
KeepKV: Eliminating Output Perturbation in KV Cache Compression for Efficient LLMs Inference
KeepKV beats CaM · TriviaQA [Llama-2-7B]
88.87 vs 87.31
KeepKV: Eliminating Output Perturbation in KV Cache Compression for Efficient LLMs Inference
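
The listed pairs mix higher-is-better scores with lower-is-better costs (TPOT, memory), so raw differences are not directly comparable. A small sketch of how the gaps could be normalised into a signed relative gain over CaM follows. The `relative_gain` helper and the `higher_is_better` flag are illustrative and not part of any cited paper; the three rows are copied from the entries above.

```python
def relative_gain(new, cam, higher_is_better=True):
    """Percentage improvement over CaM; positive means the newer method is better."""
    delta = (new - cam) if higher_is_better else (cam - new)
    return 100.0 * delta / cam


# (method, metric, setting, newer value, CaM value, higher_is_better)
rows = [
    ("SemantiCache", "Average score", "Llama-3-8B, 20% cache budget", 30.01, 27.86, True),
    ("SemantiCache", "TPOT (s)", "Llama-3-8B, 32k context", 0.031, 0.039, False),
    ("KeepKV", "NrtvQA", "Llama-2-7B", 17.32, 11.79, True),
]

for method, metric, setting, new, cam, hib in rows:
    gain = relative_gain(new, cam, hib)
    print(f"{method} vs CaM · {metric} [{setting}]: {gain:+.1f}%")
```

Because the settings differ across papers (model, context length, cache budget), gains computed this way are only meaningful within a single row, not as a ranking across methods.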

Newer alternatives

Recent methods in the same sub-problem, not yet superseded in the knowledge base.