Superseded baseline · #56 of 234 most-superseded
KVFlow
KV-cache compression
superseded — cited as a baseline and beaten by newer methods
2 papers critique it · 1 beat it on benchmarks
What papers say
Verbatim critique sentences, each from a paper that cites KVFlow as a baseline.
“This assumption fails on realistic dynamic workloads, where workflows are runtime-dependent.”
— Efficient Serving for Dynamic Agent Workflows with Prediction-based KV-Cache Management
“KVFlow also employs a proactive prefetching mechanism, but it is designed for agentic workflows with known prefix patterns. In contrast, ShadowServe supports general serving without this prior knowledge.”
— ShadowServe: Interference-Free KV Cache Fetching for Distributed Prefix Caching
Beaten on benchmarks
Head-to-head results where a newer method reports beating KVFlow. Values are copied from the source paper's tables — verify against the cited paper.
- Efficient Serving for Dynamic Agent Workflows with Prediction-based KV-Cache Management
Full PBKV beats KVFlow · Latency (s) [FinanceBench + CrewAI (static), Qwen3-32B]
80.53 (Full PBKV) vs 101.57 (KVFlow)
- Efficient Serving for Dynamic Agent Workflows with Prediction-based KV-Cache Management
Full PBKV beats KVFlow · Cache Hit Rate (%) [FinanceBench + CrewAI (static), Qwen3-32B]
53.44 (Full PBKV) vs 39.87 (KVFlow)
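To put the two head-to-head results above on a common scale, the gaps can be expressed as relative changes against KVFlow. The following is a minimal sketch, assuming that the first value in each pair belongs to Full PBKV and the second to KVFlow (consistent with "beats": lower latency, higher hit rate). The helper `relative_change` and the variable names are illustrative and do not come from either paper.

```python
def relative_change(new: float, baseline: float) -> float:
    """Return the change of `new` relative to `baseline`, as a fraction."""
    return (new - baseline) / baseline


# Values copied from the two entries above.
# Assumption: first value = Full PBKV, second value = KVFlow.
latency_pbkv, latency_kvflow = 80.53, 101.57  # seconds, lower is better
hit_pbkv, hit_kvflow = 53.44, 39.87           # percent, higher is better

latency_change = relative_change(latency_pbkv, latency_kvflow)
hit_rate_gain_points = hit_pbkv - hit_kvflow  # absolute gain in percentage points
hit_rate_change = relative_change(hit_pbkv, hit_kvflow)

print(f"Latency change vs KVFlow: {latency_change:.1%}")
print(
    f"Cache hit rate gain vs KVFlow: {hit_rate_gain_points:.2f} points "
    f"({hit_rate_change:.1%} relative)"
)
```

Under that assumption, this works out to roughly a 20.7% latency reduction and a 13.57-point (about 34% relative) gain in cache hit rate on this one static-workflow configuration. As with the raw values, check these against the cited paper before relying on them.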
Newer alternatives
Recent methods in the same sub-problem, not yet superseded in the knowledge base.
- May 7, 2026
- Sep 21, 2025