OrbitFlow
OrbitFlow: SLO-Aware Long-Context LLM Serving with Fine-Grained KV Cache Reconfiguration
KV-cache compression · first seen Jan 5, 2026
current frontier — recent, not yet superseded in the knowledge base
0 papers critique it · 0 beat it on benchmarks
Newer alternatives
Recent methods in the same sub-problem, not yet superseded in the knowledge base.
- May 22, 2026
- Apr 28, 2026
- Predictive Multi-Tier Memory Management · Predictive Multi-Tier Memory Management for KV Cache in Large-Scale GPU Inference · Apr 19, 2026
- TableCache · TableCache: Primary Foreign Key Guided KV Cache Precomputation for Low Latency Text-to-SQL · Jan 13, 2026
- Jan 5, 2026
- SemShareKV · SemShareKV: Efficient KVCache Sharing for Semantically Similar Prompts via Token-Level LSH Matching · Sep 29, 2025