Is MInference superseded?

MInference (long-context / context-window extension) is superseded: newer papers cite it as a baseline and report beating it. One paper critiques it and four report beating it on benchmarks, which places it #11 of 53 on the most-superseded list. It belongs to the sub-problem cluster led by StreamingLLM. Newer alternatives in the same sub-problem include BA-Att, CSAttention, TCA-Attention, and Dynamic Hierarchical Sparse Attention (DHSA).

What papers say

Verbatim critique sentences, each from a paper that cites MInference as a baseline.

“both rely on manually designed patterns or rules, which limits their ability to capture highly input-dependent attention sparsity”
— Long-Context Modeling with Dynamic Hierarchical Sparse Attention for On-Device LLMs

Beaten on benchmarks

Head-to-head results where a newer method reports beating MInference. Each entry gives the newer method's value first, then MInference's. Values are copied from the source paper's tables, so verify them against the cited paper.

Latent-Condensed Transformer beats MInference · Avg. [64K context on H200 GPUs]
29.09 vs 19.71
Latent-Condensed Transformer for Efficient Long Context Modeling
Latent-Condensed Transformer beats MInference · Avg. [128K context on H200 GPUs]
58.80 vs 37.60
Latent-Condensed Transformer for Efficient Long Context Modeling
DHSA beats MInference · Avg. [Llama-3.1-8B-Instruct (4-bit)]
31.8 vs 28.4
Long-Context Modeling with Dynamic Hierarchical Sparse Attention for On-Device LLMs
DHSA beats MInference · 32K [32K tokens]
76.2 vs 74.7
Long-Context Modeling with Dynamic Hierarchical Sparse Attention for On-Device LLMs
DHSA beats MInference · 48K [48K tokens]
71.5 vs 64.2
Long-Context Modeling with Dynamic Hierarchical Sparse Attention for On-Device LLMs
DGA-LLM beats MInference · Inter-Token Latency [LongBench-E benchmark]
28.80 vs 94.34 (lower is better)
Curse of High Dimensionality Issue in Transformer for Long-context Modeling
DGA-LLM beats MInference · Avg. EM Score [EM evaluation at 4K-32K lengths]
27.7 vs 23.7
Curse of High Dimensionality Issue in Transformer for Long-context Modeling
TCA-Attention beats MInference · MMLU [LLaMA3.1-8B-Instruct]
69.21 vs 69.14
Training-free Context-adaptive Attention for Efficient Long Context Modeling
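
The entries above differ widely in how large the reported gap is. A minimal Python sketch, using four of the value pairs listed above, expresses each gap as a percentage of MInference's value. The helper `relative_margin` and the `higher_is_better` flags are assumptions made for illustration: the page does not state metric direction, except that latency is naturally lower-is-better.

```python
# Sketch: relative margins for a few head-to-head entries listed above.
# Values are copied from this page; the higher_is_better flags are assumptions.

def relative_margin(newer: float, baseline: float, higher_is_better: bool = True) -> float:
    """Return the newer method's improvement over the baseline, in percent."""
    if baseline == 0:
        raise ValueError("baseline must be non-zero")
    delta = newer - baseline if higher_is_better else baseline - newer
    return delta / baseline * 100.0

# (metric and setting, newer value, MInference value, higher_is_better)
rows = [
    ("Avg. [Llama-3.1-8B-Instruct (4-bit)]", 31.8, 28.4, True),
    ("48K [48K tokens]", 71.5, 64.2, True),
    ("Inter-Token Latency [LongBench-E benchmark]", 28.80, 94.34, False),
    ("MMLU [LLaMA3.1-8B-Instruct]", 69.21, 69.14, True),
]

margins = {label: relative_margin(new, base, hib) for label, new, base, hib in rows}

for label, pct in margins.items():
    print(f"{label}: {pct:+.2f}%")
```

Under these assumptions the MMLU entry is a gap of roughly 0.1% relative to MInference, while the inter-token latency entry is a reduction of roughly 69%, so "beats" covers very different margins and each entry is worth checking against its source table.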

Newer alternatives

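Recent methods in the same sub-problem that are not yet superseded in the knowledge base include BA-Att, CSAttention, TCA-Attention, and Dynamic Hierarchical Sparse Attention (DHSA).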