Is Longformer superseded?

Longformer (long-context / context-window extension) is classified as superseded: it is cited as a baseline and beaten by newer methods. No papers critique it, and 1 paper reports beating it on benchmarks, which places it #44 of 53 among the most-superseded methods. Sub-problem: cluster led by Ring Attention.

Superseded baseline · #44 of 53 most-superseded

Longformer

Longformer: The Long-Document Transformer

Long-context / context-window extension · first seen Apr 10, 2020

superseded — cited as a baseline and beaten by newer methods

0 papers critique it · 1 beat it on benchmarks

Beaten on benchmarks

Head-to-head results where a newer method reports beating Longformer. Values are copied from the source paper's tables — verify against the cited paper.

Source paper: π-Attention: Periodic Sparse Transformers for Efficient Long-Context Modeling
Dataset: WikiText-103 · values shown as PiAttention (Ours) vs Longformer

Training Time · 12.4 vs 14.2 (lower is better)
Inference Time · 36.7 vs 40.1 (lower is better)
MFU · 55.4 vs 51.2 (higher is better)
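
To put the three head-to-head results on a common scale, the reported values can be converted to relative improvements over Longformer. The snippet below is a minimal sketch, not part of the source paper: the `results` dictionary and the `relative_gain` helper are illustrative names, and it assumes that lower is better for the two timing metrics and higher is better for MFU. The source does not state units, so only ratios are computed.

```python
# Values copied from the head-to-head results above: (newer method, Longformer, direction).
# Assumption: "lower" is better for timings, "higher" is better for MFU.
results = {
    "Training Time": (12.4, 14.2, "lower"),
    "Inference Time": (36.7, 40.1, "lower"),
    "MFU": (55.4, 51.2, "higher"),
}


def relative_gain(new, baseline, better):
    """Return the percent improvement of `new` over `baseline`."""
    if better == "lower":
        return (baseline - new) / baseline * 100
    return (new - baseline) / baseline * 100


# Percent improvement per metric, keyed by metric name.
gains = {metric: relative_gain(*row) for metric, row in results.items()}

for metric, gain in gains.items():
    print(f"{metric}: {gain:.1f}% better than Longformer")
```

Under these assumptions the improvements come to roughly 12.7% (training time), 8.5% (inference time), and 8.2% (MFU). These figures are derived only from the values listed here and should be checked against the cited paper's tables.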