Superseded baseline · #36 of 53 most-superseded
RMT
Long-context / context-window extension
superseded — cited as a baseline and beaten by newer methods
1 paper critiques it · 1 paper beats it on benchmarks
What papers say
Verbatim critique sentences, each from a paper that cites RMT as a baseline.
“many lack explicit gating mechanisms to protect important long-term memories from being overwritten by newer information”
— CoMeT: Collaborative Memory Transformer for Efficient Long Context Modeling
Beaten on benchmarks
Head-to-head results where a newer method reports beating RMT. Values are copied from the source paper's tables — verify against the cited paper.
- Gradual Forgetting: Logarithmic Compression for Extending Transformer Context Windows
Scale-Invariant Compression Filters (Ours) beats RMT · Per-word PPL [WikiText-103]
23.56 vs 24.85 (lower is better)
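To put the gap in proportion, the two reported per-word perplexities can be turned into a relative reduction. The sketch below is only illustrative arithmetic on the numbers quoted above; the function name `relative_ppl_reduction` is hypothetical and does not come from either paper.

```python
def relative_ppl_reduction(new_ppl: float, baseline_ppl: float) -> float:
    """Return the fraction by which new_ppl is lower than baseline_ppl."""
    if baseline_ppl <= 0:
        raise ValueError("baseline perplexity must be positive")
    return (baseline_ppl - new_ppl) / baseline_ppl


# Values as listed above: the newer method (23.56) vs RMT (24.85) on WikiText-103.
reduction = relative_ppl_reduction(23.56, 24.85)
print(f"{reduction:.1%}")  # prints 5.2%
```

A reduction of roughly 5% is a modest margin, so it is worth checking in the cited paper whether both numbers were obtained under comparable settings.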
Newer alternatives
Recent methods in the same sub-problem, not yet superseded in the knowledge base.