ArborKV
ArborKV: Structure-Aware KV Cache Management for Scaling Tree-based LLM Reasoning
KV-cache compression · first seen May 21, 2026
current frontier — recent, not yet superseded in the knowledge base
0 papers critique it · 0 beat it on benchmarks
Newer alternatives
Recent methods in the same sub-problem, not yet superseded in the knowledge base.
- May 21, 2026
- May 8, 2026
- Mar 24, 2026
- Mar 17, 2026
- Mar 15, 2026
- Feb 5, 2026
- Jan 29, 2026
- GPU-accelerated INT8 quantization for KV cache compression · GPU-Accelerated INT8 Quantization for KV Cache Compression in Large Language Models · Jan 8, 2026
- STA-Attention · Unlocking the Address Book: Dissecting the Sparse Semantic Structure of LLM Key-Value Caches via Sparse Autoencoders · Dec 11, 2025
- SWAN · SWAN: Sparse Winnowed Attention for Reduced Inference Memory via Decompression-Free KV-Cache Compression · Nov 24, 2025
- Oct 28, 2025
- Sep 25, 2025