Is ShadowKV superseded?

ShadowKV (KV-cache compression): superseded, meaning it is cited as a baseline and beaten by newer methods. 4 papers critique it and 6 beat it on benchmarks, which ranks it #14 of 234 most-superseded. Sub-problem: cluster led by Quest. Newer alternatives in the same sub-problem include ParisKV, KVDrive, Louver, IceCache, and ScoutAttention.

Superseded baseline · #14 of 234 most-superseded

ShadowKV

ShadowKV: KV Cache in Shadows for High-Throughput Long-Context LLM Inference

KV-cache compression · first seen Oct 28, 2024

What papers say

Verbatim critique sentences, each from a paper that cites ShadowKV as a baseline.

“However, limited by the accuracy constraints of intra-layer SVD, ShadowKV is forced to offload Value states to the CPU, leaving inference speed bounded by PCIe bandwidth.”
— xKV: Cross-Layer SVD for KV-Cache Compression
“We systematically evaluate KV offloading methods on context-intensive tasks and observe significant accuracy drops”
— KV Cache Offloading for Context-Intensive Tasks
“ShadowKV does not support long-generation since the SVD is performed only once during prefill, leaving the low-rank key unupdated during decoding.”
— FreeKV: Boosting KV Cache Retrieval for Efficient LLM Inference
“Unfortunately, their coarse-grained retrieval strategies often overlook fine-grained dependencies and incur high I/O overhead.”
— HeteroCache: A Dynamic Retrieval Approach to Heterogeneous KV Cache Compression for Long-Context LLM Inference
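The first critique above says that offloading Value states to the CPU leaves decoding bounded by PCIe bandwidth. A back-of-envelope sketch of that bound follows. The model shape matches the public Llama-3.1-8B configuration (32 layers, 8 KV heads, head dimension 128, 16-bit values); the number of tokens fetched per step and the effective PCIe bandwidth are hypothetical values chosen for illustration, not figures from ShadowKV or the cited papers. The estimate also ignores any overlap of transfer with compute and any GPU-side caching.

```python
from dataclasses import dataclass


@dataclass
class ModelShape:
    """Shape parameters that determine Value-cache size per token."""
    layers: int = 32          # Llama-3.1-8B (public config)
    kv_heads: int = 8         # grouped-query attention KV heads
    head_dim: int = 128
    bytes_per_elem: int = 2   # fp16 / bf16


def value_bytes_per_step(shape: ModelShape, tokens_fetched: int, batch: int = 1) -> int:
    """Bytes of Value states pulled from CPU memory for one decoding step.

    tokens_fetched is a hypothetical sparse budget: how many past tokens'
    Values are fetched per layer and KV head at each step.
    """
    per_token = shape.layers * shape.kv_heads * shape.head_dim * shape.bytes_per_elem
    return per_token * tokens_fetched * batch


def transfer_floor_ms(num_bytes: int, pcie_gb_per_s: float) -> float:
    """Lower bound on step latency if the transfer is not hidden by compute."""
    return num_bytes / (pcie_gb_per_s * 1e9) * 1e3


if __name__ == "__main__":
    shape = ModelShape()
    budget = 2048        # assumed tokens fetched per step (illustrative)
    bandwidth = 25.0     # assumed effective PCIe bandwidth in GB/s (illustrative)
    for batch in (1, 8, 32):
        nbytes = value_bytes_per_step(shape, budget, batch)
        ms = transfer_floor_ms(nbytes, bandwidth)
        print(f"batch={batch:3d}  {nbytes / 2**20:8.0f} MiB/step  >= {ms:7.2f} ms/step")
```

With these assumed numbers, a single sequence moves 128 MiB of Values per step, which takes roughly 5.4 ms at 25 GB/s, and the cost grows linearly with batch size. This is the sense in which transfer volume, rather than GPU compute, can set the throughput ceiling.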

Beaten on benchmarks

Head-to-head results where a newer method reports beating ShadowKV. Values are copied from the source papers' tables; verify each one against the cited paper.

xKSR (Ours) beats ShadowKV · Avg. [Llama-3.1-8B-Instruct, Comp. 1.59 (7.76)]
89.97 vs 88.80
xKV: Cross-Layer SVD for KV-Cache Compression
xKSR (Ours) beats ShadowKV · Avg. [Llama-3.1-8B-Instruct, Comp. 1.63 (8.90)]
89.70 vs 87.17
xKV: Cross-Layer SVD for KV-Cache Compression
xKSR (Ours) beats ShadowKV · Avg. [Llama-3.1-8B-Instruct, Comp. 1.68 (10.45)]
88.34 vs 64.91
xKV: Cross-Layer SVD for KV-Cache Compression
xKVSR (Ours) beats ShadowKV · Avg. [Llama-3.1-8B-Instruct, Comp. 4.37]
89.83 vs 86.32
xKV: Cross-Layer SVD for KV-Cache Compression
xKVSR (Ours) beats ShadowKV · Avg. [Llama-3.1-8B-Instruct, Comp. 5.35]
89.69 vs 70.94
xKV: Cross-Layer SVD for KV-Cache Compression
xKSR beats ShadowKV · Avg. [Llama-3.1-8B-Instruct, Comp. 1.68 (10.45)]
42.50 vs 40.51
xKV: Cross-Layer SVD for KV-Cache Compression
xKVSR beats ShadowKV · Avg. [Llama-3.1-8B-Instruct, Comp. 1.63 (8.90)]
42.69 vs 42.21
xKV: Cross-Layer SVD for KV-Cache Compression
xKVSR beats ShadowKV · Avg. [Llama-3.1-8B-Instruct, Comp. 5.35]
42.40 vs 41.51
xKV: Cross-Layer SVD for KV-Cache Compression
KVDrive beats ShadowKV · Avg (RULER) [Qwen-3-8B]
68.07 vs 67.03
KVDrive: A Holistic Multi-Tier KV Cache Management System for Long-Context LLM Inference
SVDq+Sparsity beats ShadowKV · Average [Qwen2.5-14B-Instruct]
73.1 vs 72.6
SVDq: 1.25-bit and 410x Key Cache Compression for LLM Attention
SVDq+Sparsity beats ShadowKV · Average [Qwen2.5-7B-Instruct]
66.8 vs 63.6
SVDq: 1.25-bit and 410x Key Cache Compression for LLM Attention
SVDq+Sparsity beats ShadowKV · Average [Qwen2.5-3B-Instruct]
55.5 vs 51.8
SVDq: 1.25-bit and 410x Key Cache Compression for LLM Attention
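
The size of each win varies widely across the entries above. The short sketch below computes the margin (newer score minus ShadowKV score) for every listed row and sorts by it. The scores are copied from the list; the `ROWS` layout and the `margins` helper are illustrative names, not part of any cited paper's tooling.

```python
# Rows copied from the "Beaten on benchmarks" list:
# (newer method, setting, newer score, ShadowKV score)
ROWS = [
    ("xKSR", "Llama-3.1-8B-Instruct, Comp. 1.59 (7.76)", 89.97, 88.80),
    ("xKSR", "Llama-3.1-8B-Instruct, Comp. 1.63 (8.90)", 89.70, 87.17),
    ("xKSR", "Llama-3.1-8B-Instruct, Comp. 1.68 (10.45)", 88.34, 64.91),
    ("xKVSR", "Llama-3.1-8B-Instruct, Comp. 4.37", 89.83, 86.32),
    ("xKVSR", "Llama-3.1-8B-Instruct, Comp. 5.35", 89.69, 70.94),
    ("xKSR", "Llama-3.1-8B-Instruct, Comp. 1.68 (10.45)", 42.50, 40.51),
    ("xKVSR", "Llama-3.1-8B-Instruct, Comp. 1.63 (8.90)", 42.69, 42.21),
    ("xKVSR", "Llama-3.1-8B-Instruct, Comp. 5.35", 42.40, 41.51),
    ("KVDrive", "Qwen-3-8B, RULER", 68.07, 67.03),
    ("SVDq+Sparsity", "Qwen2.5-14B-Instruct", 73.1, 72.6),
    ("SVDq+Sparsity", "Qwen2.5-7B-Instruct", 66.8, 63.6),
    ("SVDq+Sparsity", "Qwen2.5-3B-Instruct", 55.5, 51.8),
]


def margins(rows):
    """Return (method, setting, margin) tuples sorted by margin, largest first."""
    out = [(method, setting, round(new - old, 2)) for method, setting, new, old in rows]
    return sorted(out, key=lambda row: row[2], reverse=True)


if __name__ == "__main__":
    for method, setting, margin in margins(ROWS):
        print(f"{margin:6.2f}  {method:14s} {setting}")
```

Margins are only comparable within one paper and one benchmark. Several xKV rows share a setting label but report different score ranges, which suggests they come from different benchmarks that the list does not name, so the sorted output is a reading aid rather than a ranking.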

Newer alternatives

Recent methods in the same sub-problem, not yet superseded in the knowledge base.