FastV
KV-cache compression
superseded — cited as a baseline and beaten by newer methods
2 papers critique it · 2 beat it on benchmarks
What papers say
Verbatim critique sentences, each from a paper that cites FastV as a baseline.
“However, most existing methods depend on query-derived text tokens to compute attention scores and initiate compression, inevitably causing response delays in online scenarios.”
— LiveVLM: Efficient Online Video Understanding via Streaming-Oriented KV Cache and Retrieval
“FastV, though training-free, prunes vision tokens without cross-modality guidance, yielding inconsistent results across models and benchmarks.”
— Make Your LVLM KV Cache More Lightweight
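The first critique concerns methods that score tokens using attention from query text tokens. The following is a minimal sketch of that general idea, not FastV's actual implementation: the function name `prune_by_attention`, the `keep_ratio` parameter, and the toy attention matrix are all hypothetical and chosen for illustration only.

```python
def prune_by_attention(attn, keep_ratio):
    """Keep the visual tokens that receive the most attention from query tokens.

    attn: list of rows, one per query (text) token; each row holds that
          token's attention weights over the visual tokens.
    keep_ratio: fraction of visual tokens to retain (hypothetical parameter).
    Returns the indices of the retained visual tokens in their original order.
    """
    if not attn or not attn[0]:
        return []
    n_visual = len(attn[0])
    # Average attention each visual token receives across all query tokens.
    scores = [sum(row[j] for row in attn) / len(attn) for j in range(n_visual)]
    n_keep = max(1, int(n_visual * keep_ratio))
    # Rank visual tokens by score, keep the top n_keep, restore original order.
    ranked = sorted(range(n_visual), key=lambda j: scores[j], reverse=True)
    return sorted(ranked[:n_keep])


# Toy example: 2 query tokens attending over 4 visual tokens.
attn = [
    [0.05, 0.40, 0.10, 0.45],
    [0.10, 0.30, 0.20, 0.40],
]
kept = prune_by_attention(attn, keep_ratio=0.5)
print(kept)
```

Because the scores in this sketch are computed from the query tokens' attention rows, pruning cannot begin until the query is available, which is the source of the response delay the LiveVLM quotation describes for online scenarios.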
Beaten on benchmarks
Head-to-head results where a newer method reports beating FastV. Values are copied from the source paper's tables — verify against the cited paper.
Source for all entries: KVCapsule: Efficient Sequential KV Cache Compression for Vision-Language Models with Asymmetric Redundancy. Each entry gives the benchmark and model, then the KVCapsule score vs the FastV score.
- MME [Qwen2] · 1634.22 vs 1498.46
- LLaVABench [Qwen2] · 74.80 vs 55.50
- HallusionBench [Qwen2] · 60.99 vs 58.68
- LLaVABench [Qwen2.5] · 83.70 vs 12.90
- HallusionBench [Qwen2.5] · 62.20 vs 25.60
- MME [Qwen3] · 1708.49 vs 561.25
- MMMU [Qwen3] · 0.48 vs 0.35
- COCO [Qwen3] · 15.80 vs 8.01
- LLaVABench [Qwen3] · 73.80 vs 22.30
- MME [LLaVA-Mistral] · 1458.10 vs 1440.69
- MME [LLaVA-Llama] · 1533.40 vs 1525.65
- MMMU [LLaVA-Llama] · 0.60 vs 0.46
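Since the benchmarks above use different scales (MME scores are in the thousands, MMMU scores are fractions), the raw gaps are hard to compare directly. The sketch below computes the relative gain for each head-to-head entry. The values are transcribed from the entries above and carry the same caveat: verify them against the cited paper. The helper name `relative_gain` is an illustrative choice, and the script assumes the first number in each entry is the KVCapsule score.

```python
# (benchmark, model, KVCapsule score, FastV score), transcribed from the list above.
results = [
    ("MME", "Qwen2", 1634.22, 1498.46),
    ("LLaVABench", "Qwen2", 74.80, 55.50),
    ("HallusionBench", "Qwen2", 60.99, 58.68),
    ("LLaVABench", "Qwen2.5", 83.70, 12.90),
    ("HallusionBench", "Qwen2.5", 62.20, 25.60),
    ("MME", "Qwen3", 1708.49, 561.25),
    ("MMMU", "Qwen3", 0.48, 0.35),
    ("COCO", "Qwen3", 15.80, 8.01),
    ("LLaVABench", "Qwen3", 73.80, 22.30),
    ("MME", "LLaVA-Mistral", 1458.10, 1440.69),
    ("MME", "LLaVA-Llama", 1533.40, 1525.65),
    ("MMMU", "LLaVA-Llama", 0.60, 0.46),
]


def relative_gain(new, old):
    """Percentage improvement of `new` over `old`."""
    return (new - old) / old * 100.0


gains = [relative_gain(kvc, fastv) for _, _, kvc, fastv in results]

for (bench, model, kvc, fastv), gain in zip(results, gains):
    print(f"{bench:15s} {model:14s} {kvc:9.2f} {fastv:9.2f} {gain:+8.1f}%")
```

Read this way, the margins on the LLaVA-based models are small compared with those reported for Qwen2.5 and Qwen3, where the FastV scores drop sharply. That spread is consistent with the critique above that FastV yields inconsistent results across models.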
Newer alternatives
Recent methods in the same sub-problem, not yet superseded in the knowledge base.
- May 21, 2026
- KVCapsule · KVCapsule: Efficient Sequential KV Cache Compression for Vision-Language Models with Asymmetric Redundancy · May 14, 2026
- Decoupled Streaming Cache (DSCache) · Decouple and Cache: KV Cache Construction for Streaming Video Understanding · May 3, 2026
- May 1, 2026
- Hierarchical Adaptive Eviction (HAE) · Hierarchical Adaptive Eviction for KV Cache Management in Multimodal Language Models · Feb 2, 2026
- Dec 13, 2025
- StreamKV · StreamKV: Streaming Video Question-Answering with Segment-based KV Cache Retrieval and Compression · Nov 10, 2025