Superseded baseline · #55 of 234 most-superseded
EPIC
EPIC: Efficient Position-Independent Caching for Serving Large Language Models
KV-cache compression · first seen Oct 20, 2024
superseded — cited as a baseline and beaten by newer methods
2 papers critique it · 1 beat it on benchmarks
What papers say
Verbatim critique sentences, each from a paper that cites EPIC as a baseline.
“Nonetheless, existing systems (vLLM, CacheBlend, CacheCraft, Epic) [kwon2023efficient, yao2025cacheblend, agarwal2025cache, hu2024epic] operate at coarse granularity and thus fundamentally cannot support selective sharing: they reuse KV cache at the level of fixed chunks (e.g., 512 tokens [agarwal2025cache]) or entire prompt, so the presence of a single sensitive token (e.g., PII) invalidates the whole unit and discards most otherwise reusable content”
— CachePrune: Privacy-Aware and Fine-Grained KV Cache Sharing for Efficient LLM Inference
“rely on exact context matching, which is unsuitable for real user scenarios”
— SemShareKV: Efficient KVCache Sharing for Semantically Similar Prompts via Token-Level LSH Matching
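The granularity critique quoted from CachePrune can be illustrated with a toy count. This is a minimal sketch, not the implementation of EPIC, CachePrune, or any other system: the function names, the 2048-token prompt, and the position of the sensitive token are hypothetical, and only the 512-token chunk size comes from the quote. It also assumes, for simplicity, that every non-sensitive token is reusable under token-level sharing, which ignores attention dependencies between tokens.

```python
def reusable_tokens_chunked(sensitive, chunk_size):
    """Count reusable tokens when a whole chunk is discarded
    as soon as it contains at least one sensitive token."""
    reusable = 0
    for start in range(0, len(sensitive), chunk_size):
        chunk = sensitive[start:start + chunk_size]
        if not any(chunk):
            reusable += len(chunk)
    return reusable


def reusable_tokens_token_level(sensitive):
    """Count reusable tokens when only the sensitive tokens are discarded
    (simplifying assumption: all other tokens stay reusable)."""
    return sum(1 for flag in sensitive if not flag)


# Hypothetical prompt: 2048 tokens, one sensitive token at position 700.
flags = [False] * 2048
flags[700] = True

chunked = reusable_tokens_chunked(flags, chunk_size=512)        # fixed 512-token chunks
whole_prompt = reusable_tokens_chunked(flags, chunk_size=2048)  # entire prompt as one unit
token_level = reusable_tokens_token_level(flags)

print(chunked, whole_prompt, token_level)
```

Under these assumptions, the single flagged token costs one full chunk with 512-token chunks and the entire cache when the prompt is the reuse unit, while token-level sharing loses only that one token.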
Beaten on benchmarks
Head-to-head results where a newer method reports beating EPIC. Values are copied from the source paper's tables — verify against the cited paper.
- CachePrune: Privacy-Aware and Fine-Grained KV Cache Sharing for Efficient LLM Inference
CachePrune beats EPIC · TTFT [WildChat]: 134 vs 165
CachePrune beats EPIC · TTFT [ShareGPT]: 177 vs 209
CachePrune beats EPIC · TTFT [LMSys]: 74 vs 94
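To put the three head-to-head results on a common scale, the relative TTFT reduction can be computed from the listed values. This is a small sketch under two assumptions that the page does not state explicitly: that the first number in each pair is CachePrune's and the second is EPIC's, and that lower TTFT (time to first token) is better. The page gives no units, so only ratios are computed; the variable and function names are illustrative.

```python
# (CachePrune, EPIC) TTFT values as listed above; ordering is an assumption.
results = {
    "WildChat": (134, 165),
    "ShareGPT": (177, 209),
    "LMSys": (74, 94),
}


def relative_reduction(new, baseline):
    """Fraction by which `new` is lower than `baseline`."""
    return (baseline - new) / baseline


reductions = {
    dataset: relative_reduction(new, baseline)
    for dataset, (new, baseline) in results.items()
}

for dataset, reduction in reductions.items():
    print(f"{dataset}: {reduction:.1%} lower TTFT than EPIC")
```

Under those assumptions the reductions come out at roughly 18.8% (WildChat), 15.3% (ShareGPT), and 21.3% (LMSys). As the section notes, the underlying values should still be verified against the cited paper.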
Newer alternatives
Recent methods in the same sub-problem, not yet superseded in the knowledge base.
- May 22, 2026
- Apr 28, 2026
- Predictive Multi-Tier Memory Management · Predictive Multi-Tier Memory Management for KV Cache in Large-Scale GPU Inference · Apr 19, 2026
- TableCache · TableCache: Primary Foreign Key Guided KV Cache Precomputation for Low Latency Text-to-SQL · Jan 13, 2026
- Jan 5, 2026
- SemShareKV · SemShareKV: Efficient KVCache Sharing for Semantically Similar Prompts via Token-Level LSH Matching · Sep 29, 2025