Superseded baseline · #37 of 234 most-superseded
PagedAttention
KV-cache compression
superseded — cited as a baseline and beaten by newer methods
5 papers critique it · 0 beat it on benchmarks
What papers say
Verbatim critique sentences, each from a paper that cites PagedAttention as a baseline.
“System-level approaches like PagedAttention optimize KV storage to mitigate fragmentation, yet remain agnostic to the semantic importance of content under tight budgets.”
— ArborKV: Structure-Aware KV Cache Management for Scaling Tree-based LLM Reasoning
“Orthogonally, system-level solutions like PagedAttention (Kwon et al., 2023) optimize memory management but do not reduce the fundamental size of the cache itself.”
— SWAN: Sparse Winnowed Attention for Reduced Inference Memory via Decompression-Free KV-Cache Compression
“Despite these advancements, no existing long-context benchmarks evaluate KV cache reuse scenarios.”
— SCBench: A KV Cache-Centric Analysis of Long-Context Methods
“applies paging techniques to reduce memory fragmentation but maintains full precision storage”
— GPU-Accelerated INT8 Quantization for KV Cache Compression in Large Language Models
“These systems treat all transferred KV at uniform precision; adds per-token precision differentiation”
— SpectrumKV: Per-Token Mixed-Precision KV Cache Transfer for Prefill-Decode Disaggregated LLM Serving
Newer alternatives
Recent methods in the same sub-problem, not yet superseded in the knowledge base.
- May 21, 2026
- May 8, 2026
- Mar 24, 2026
- Mar 17, 2026
- Mar 15, 2026
- Feb 5, 2026
- Jan 29, 2026
- GPU-accelerated INT8 quantization for KV cache compression · GPU-Accelerated INT8 Quantization for KV Cache Compression in Large Language Models · Jan 8, 2026
- STA-Attention · Unlocking the Address Book: Dissecting the Sparse Semantic Structure of LLM Key-Value Caches via Sparse Autoencoders · Dec 11, 2025
- SWAN · SWAN: Sparse Winnowed Attention for Reduced Inference Memory via Decompression-Free KV-Cache Compression · Nov 24, 2025
- Oct 28, 2025
- Sep 25, 2025