ReKV
Streaming Video Question-Answering with In-context Video KV-Cache Retrieval
KV-cache compression · first seen Mar 1, 2025
superseded — cited as a baseline and beaten by newer methods
6 papers critique it · 2 beat it on benchmarks
What papers say
Verbatim critique sentences, each from a paper that cites ReKV as a baseline.
“AGX+ReKV's coarse, frame-level selection offers modest latency gains but still requires selecting many tokens to maintain accuracy, limiting its effectiveness.”
— V-Rex: Real-Time Streaming Video LLM Acceleration via Dynamic KV Cache Retrieval
“per-frame KV caches are insufficient for video modeling (lacking spatial details and temporal contexts) whilst consuming large amounts of storage that in turn reduces retrieval and QA performance”
— MuKV: Multi-Grained KV Cache Compression for Long Streaming Video Question-Answering
“KV cache offloading (e.g., ReKV) expands memory space yet incurs costly data transfer, repeated for each query.”
— InfiniPot-V: Memory-Constrained KV Cache Compression for Streaming Video Understanding
“Although retaining full Key-Value (KV) cache in the CPU and retrieving them based on queries facilitates the preservation and retrieval of fine-grained information, it leads to severe inefficiency.”
— LiveVLM: Efficient Online Video Understanding via Streaming-Oriented KV Cache and Retrieval
“However, it retains all generated KV caches, resulting in substantial memory consumption, and its retrieval strategy requires further optimization.”
— StreamKV: Streaming Video Question-Answering with Segment-based KV Cache Retrieval and Compression
“The offloading of KV cache could incur a lot of memory and is not scalable to ultra-long videos.”
— StreamMem: Query-Agnostic KV Cache Memory for Streaming Video Understanding
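The critiques above share one target: keeping every frame's KV cache in offloaded storage and pulling back whole frames per query. The sketch below is a toy illustration of that general pattern, not ReKV's actual implementation. The function names, the mean-pooled frame summary, and the cosine scoring are all assumptions made for illustration.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def mean_vector(vectors):
    """Average a list of token key vectors into one frame-level summary vector."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]

def retrieve_frames(offloaded_cache, query_vec, top_k):
    """Pick the top_k frames whose summary is most similar to the query.

    offloaded_cache: hypothetical dict mapping frame_id -> list of token key
    vectors. Every frame is retained, so storage grows with video length, and
    every query triggers a new scoring pass and transfer of the chosen frames.
    """
    scored = [
        (cosine(mean_vector(keys), query_vec), frame_id)
        for frame_id, keys in offloaded_cache.items()
    ]
    scored.sort(reverse=True)
    # Return the selected frames in temporal order.
    return sorted(frame_id for _, frame_id in scored[:top_k])

# Toy usage: three frames, two tokens each, 2-dimensional keys.
cache = {
    0: [[1.0, 0.0], [1.0, 0.0]],
    1: [[0.0, 1.0], [0.0, 1.0]],
    2: [[1.0, 1.0], [1.0, 1.0]],
}
selected = retrieve_frames(cache, [1.0, 0.0], top_k=2)
```

In this toy setup, selection is all-or-nothing per frame, which is the coarse granularity the V-Rex and MuKV quotes object to, and the cache dictionary never shrinks, which is the memory growth that StreamKV and StreamMem point at.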
Beaten on benchmarks
Head-to-head results where a newer method reports beating ReKV. Values are copied from the source paper's tables — verify against the cited paper.
- MuKV: Multi-Grained KV Cache Compression for Long Streaming Video Question-Answering
MuKV (Ours) beats ReKV · EgoSchema [LLaVA-OV-7B]
63.3 vs 60.7
- MuKV: Multi-Grained KV Cache Compression for Long Streaming Video Question-Answering
MuKV (Ours) beats ReKV · VideoMME All [LLaVA-OV-7B]
61.2 vs 59.3
- MuKV: Multi-Grained KV Cache Compression for Long Streaming Video Question-Answering
MuKV (Ours) beats ReKV · MLVU [LLaVA-OV-0.5B]
55.2 vs 53.2
- MuKV: Multi-Grained KV Cache Compression for Long Streaming Video Question-Answering
MuKV (Ours) beats ReKV · EgoSchema [LLaVA-OV-0.5B]
30.5 vs 29.6
- MuKV: Multi-Grained KV Cache Compression for Long Streaming Video Question-Answering
MuKV beats ReKV · Acc@Ego [ReKV vs MuKV (2/3 compress ratio)]
57.3 vs 51.5
- Decouple and Cache: KV Cache Construction for Streaming Video Understanding
DSCache beats ReKV · StreamingBench (Real-time) [LLaVA-OV-7B]
79.12 vs 71.06
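To compare the margins at a glance, the listed pairs can be tabulated as in the short sketch below. It assumes the first number in each "X vs Y" pair belongs to the newer method and the second to ReKV, which matches the "beats ReKV" labels but should be checked against the cited tables, especially for the Acc@Ego row, whose bracket lists ReKV first. The tuple layout and the `margins` helper are illustrative, not part of the knowledge base.

```python
# (newer method, benchmark [setting], newer score, ReKV score), copied from the list above.
results = [
    ("MuKV", "EgoSchema [LLaVA-OV-7B]", 63.3, 60.7),
    ("MuKV", "VideoMME All [LLaVA-OV-7B]", 61.2, 59.3),
    ("MuKV", "MLVU [LLaVA-OV-0.5B]", 55.2, 53.2),
    ("MuKV", "EgoSchema [LLaVA-OV-0.5B]", 30.5, 29.6),
    ("MuKV", "Acc@Ego [2/3 compress ratio]", 57.3, 51.5),  # score order assumed
    ("DSCache", "StreamingBench (Real-time) [LLaVA-OV-7B]", 79.12, 71.06),
]

def margins(rows):
    """Return (method, benchmark, newer score minus ReKV score), rounded to 2 decimals."""
    return [
        (method, bench, round(newer - rekv, 2))
        for method, bench, newer, rekv in rows
    ]

for method, bench, delta in margins(results):
    print(f"{method:8s} {bench:45s} +{delta:.2f}")
```

Under that reading, the reported gaps range from 0.9 points (EgoSchema, LLaVA-OV-0.5B) to 8.06 points (StreamingBench Real-time).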
Newer alternatives
Recent methods in the same sub-problem, not yet superseded in the knowledge base.
- May 21, 2026
- KVCapsule · KVCapsule: Efficient Sequential KV Cache Compression for Vision-Language Models with Asymmetric Redundancy · May 14, 2026
- Decoupled Streaming Cache (DSCache) · Decouple and Cache: KV Cache Construction for Streaming Video Understanding · May 3, 2026
- May 1, 2026
- Hierarchical Adaptive Eviction (HAE) · Hierarchical Adaptive Eviction for KV Cache Management in Multimodal Language Models · Feb 2, 2026
- Dec 13, 2025
- StreamKV · StreamKV: Streaming Video Question-Answering with Segment-based KV Cache Retrieval and Compression · Nov 10, 2025