Tracked
IsoQuant
IsoQuant: Hardware-Aligned SO(4) Isoclinic Rotations for LLM KV Cache Compression
KV-cache compression · first seen Mar 30, 2026
current frontier — recent, not yet superseded in the knowledge base
0 papers critique it · 0 beat it on benchmarks
Newer alternatives
Recent methods in the same sub-problem, not yet superseded in the knowledge base.
- SpectrumKV: Per-Token Mixed-Precision KV Cache Transfer for Prefill-Decode Disaggregated LLM Serving · Jun 7, 2026
- Hurwitz Quaternion Multiplicative Quantization (HQMQ) · Hurwitz Quaternion Multiplicative Quantization for KV Cache Compression · May 26, 2026
- May 18, 2026
- May 18, 2026
- TriAxialKV: Toward Extreme Low-Precision KV-Cache Quantization for Agentic Inference Tasks · May 16, 2026
- KVServe: Service-Aware KV Cache Compression for Communication-Efficient Disaggregated LLM Serving · May 13, 2026
- WindowQuant: Mixed-Precision KV Cache Quantization based on Window-Level Similarity for VLMs Inference Optimization · May 4, 2026
- Apr 21, 2026
- eOptShrinkQ: Near-Lossless KV Cache Compression Through Optimal Spectral Denoising and Quantization · Apr 6, 2026
- Apr 3, 2026
- Mar 30, 2026
- Mar 29, 2026