Is Lexico superseded?

Lexico (KV-cache compression) is marked superseded: it is cited as a baseline and beaten by newer methods. 3 papers critique it and 0 beat it on benchmarks, which ranks it #62 of 234 most-superseded. Sub-problem: the cluster led by Palu. Newer alternatives in the same sub-problem include ArborKV, RDKV, EchoKV, VQKV, and Self-Indexing KVCache.

Superseded baseline · #62 of 234 most-superseded

Lexico

Lexico: Extreme KV Cache Compression via Sparse Coding over Universal Dictionaries

KV-cache compression · first seen Dec 12, 2024

What papers say

Verbatim critique sentences, each from a paper that cites Lexico as a baseline.

“Frameworks such as Lexico [kim2024lexicoextremekvcache] introduce significant latency by relying on separate compression and decompression steps at every single decoding stage.”
— SWAN: Sparse Winnowed Attention for Reduced Inference Memory via Decompression-Free KV-Cache Compression
“unlike Lexico's uniform compression, we leverage the Semantic Elbow and Key-Value Asymmetry to dynamically allocate budgets---heavily compressing sparse routing information while preserving dense semantic content”
— Unlocking the Address Book: Dissecting the Sparse Semantic Structure of LLM Key-Value Caches via Sparse Autoencoders
“Unfortunately, this approach requires solving a computationally expensive matching pursuit algorithm for each key and value embedding, making Lexico relatively slow.”
— PolarQuant: Quantizing KV Caches with Polar Transformation
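
The latency critiques above center on the matching pursuit step that sparse coding over a dictionary requires. As a rough illustration only, the following is a minimal sketch of generic orthogonal matching pursuit in NumPy. It is not Lexico's implementation; the function name `omp`, the dictionary `D`, and the sparsity budget `s` are hypothetical names chosen for this example.

```python
import numpy as np

def omp(D, x, s):
    """Approximate x with at most s columns (atoms) of the dictionary D.

    Generic orthogonal matching pursuit, shown for illustration only.
    D has shape (d, n); x has shape (d,). Returns (support, coeffs).
    """
    residual = x.astype(float).copy()
    support = []
    coeffs = np.zeros(0)
    for _ in range(s):
        # Score every atom against the current residual: O(d * n) work.
        j = int(np.argmax(np.abs(D.T @ residual)))
        if j in support:
            break  # nothing new to add (residual is already ~0)
        support.append(j)
        # Refit the coefficients on all selected atoms by least squares.
        coeffs, *_ = np.linalg.lstsq(D[:, support], x, rcond=None)
        residual = x - D[:, support] @ coeffs
    return np.array(support), coeffs

# Toy usage: an identity dictionary and a 2-sparse vector.
D = np.eye(4)
x = np.array([0.0, 3.0, 0.0, -2.0])
support, coeffs = omp(D, x, s=2)
reconstruction = D[:, support] @ coeffs
```

Each iteration scans the whole dictionary and solves a small least-squares problem. A scheme that runs such a loop for every key and value vector repeats this cost per token, which is the overhead the critiques above describe.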

Newer alternatives

Recent methods in the same sub-problem, not yet superseded in the knowledge base.