Is STA-Attention superseded?

STA-Attention (KV-cache compression) is at the current frontier: it is recent and not yet superseded in the knowledge base. No papers critique it and none beat it on benchmarks, so it is not ranked as a superseded baseline. It belongs to the sub-problem cluster led by Palu. Newer alternatives in the same sub-problem include ArborKV, RDKV, EchoKV, VQKV, and Self-Indexing KVCache.

STA-Attention

Unlocking the Address Book: Dissecting the Sparse Semantic Structure of LLM Key-Value Caches via Sparse Autoencoders

KV-cache compression · first seen Dec 11, 2025

current frontier — recent, not yet superseded in the knowledge base

0 papers critique it · 0 beat it on benchmarks

Newer alternatives

Recent methods in the same sub-problem, not yet superseded in the knowledge base.

ArborKV ArborKV: Structure-Aware KV Cache Management for Scaling Tree-based LLM Reasoning
May 21, 2026
RDKV RDKV: Rate-Distortion Bit Allocation for Joint Eviction and Quantization of the KV Cache
May 8, 2026
EchoKV EchoKV: Efficient KV Cache Compression via Similarity-Based Reconstruction
Mar 24, 2026
VQKV VQKV: High-Fidelity and High-Ratio Cache Compression via Vector-Quantization
Mar 17, 2026
Self-Indexing KVCache Self-Indexing KVCache: Predicting Sparse Attention from Compressed Keys
Mar 15, 2026
KV-CoRE KV-CoRE: Benchmarking Data-Dependent Low-Rank Compressibility of KV-Caches in LLMs
Feb 5, 2026
StiefAttention Don't be so Stief! Learning KV Cache low-rank approximation over the Stiefel manifold
Jan 29, 2026
GPU-Accelerated INT8 quantization for KV cache compression GPU-Accelerated INT8 Quantization for KV Cache Compression in Large Language Models
Jan 8, 2026
STA-Attention Unlocking the Address Book: Dissecting the Sparse Semantic Structure of LLM Key-Value Caches via Sparse Autoencoders
Dec 11, 2025
SWAN SWAN: Sparse Winnowed Attention for Reduced Inference Memory via Decompression-Free KV-Cache Compression
Nov 24, 2025
SALS SALS: Sparse Attention in Latent Space for KV cache Compression
Oct 28, 2025
OjaKV OjaKV: Context-Aware Online Low-Rank KV Cache Compression with Oja's Rule
Sep 25, 2025