Is WindowQuant superseded?

WindowQuant (KV-cache compression) is at the current frontier: it is recent and not yet superseded in the knowledge base. No papers critique it and none beat it on benchmarks, so it is not ranked as a superseded baseline. Its sub-problem cluster is led by KIVI. Newer alternatives in the same sub-problem include SpectrumKV, Hurwitz Quaternion Multiplicative Quantization (HQMQ), OSCAR, and TriAxialKV.

WindowQuant

WindowQuant: Mixed-Precision KV Cache Quantization based on Window-Level Similarity for VLMs Inference Optimization

KV-cache compression · first seen May 4, 2026

current frontier — recent, not yet superseded in the knowledge base

0 papers critique it · 0 beat it on benchmarks

Newer alternatives

Recent methods in the same sub-problem, not yet superseded in the knowledge base.

SpectrumKV SpectrumKV: Per-Token Mixed-Precision KV Cache Transfer for Prefill-Decode Disaggregated LLM Serving
Jun 7, 2026
Hurwitz Quaternion Multiplicative Quantization (HQMQ) Hurwitz Quaternion Multiplicative Quantization for KV Cache Compression
May 26, 2026
OSCAR OSCAR: Offline Spectral Covariance-Aware Rotation for 2-bit KV Cache Quantization
May 18, 2026
TriAxialKV TriAxialKV: Toward Extreme Low-Precision KV-Cache Quantization for Agentic Inference Tasks
May 16, 2026
KVServe KVServe: Service-Aware KV Cache Compression for Communication-Efficient Disaggregated LLM Serving
May 13, 2026
WindowQuant WindowQuant: Mixed-Precision KV Cache Quantization based on Window-Level Similarity for VLMs Inference Optimization
May 4, 2026
SAW-INT4 SAW-INT4: System-Aware 4-Bit KV-Cache Quantization for Real-World LLM Serving
Apr 21, 2026
eOptShrinkQ eOptShrinkQ: Near-Lossless KV Cache Compression Through Optimal Spectral Denoising and Quantization
Apr 6, 2026
OCTOPUS Octopus: Enhancing CXL Memory Pods via Sparse Topology
Apr 3, 2026
IsoQuant IsoQuant: Hardware-Aligned SO(4) Isoclinic Rotations for LLM KV Cache Compression
Mar 30, 2026
TurboAngle TurboAngle: Near-Lossless KV Cache Compression via Uniform Angle Quantization
Mar 29, 2026