HMT (Long-context / context-window extension): superseded, meaning it is cited as a baseline and beaten by newer methods. 1 paper critiques it and 1 beats it on benchmarks; it ranks #30 of 53 most-superseded. Sub-problem: cluster led by Activation Beacon. Newer alternatives in the same sub-problem include SharedLLM and Gradual Forgetting.

Superseded baseline · #30 of 53 most-superseded

HMT

HMT: Hierarchical Memory Transformer for Efficient Long Context Language Processing

Long-context / context-window extension · first seen May 9, 2024

superseded — cited as a baseline and beaten by newer methods

1 paper critiques it · 1 beats it on benchmarks

What papers say

Verbatim critique sentences, each from a paper that cites HMT as a baseline.

“many lack explicit gating mechanisms to protect important long-term memories from being overwritten by newer information”
— CoMeT: Collaborative Memory Transformer for Efficient Long Context Modeling

Beaten on benchmarks

Head-to-head results where a newer method reports beating HMT. Values are copied from the source paper's tables — verify against the cited paper.

CoMeT beats HMT · Avg [SCROLLS benchmark, 3k memory budget]
40.10 (CoMeT) vs 30.31 (HMT)
CoMeT: Collaborative Memory Transformer for Efficient Long Context Modeling

Newer alternatives

Recent methods in the same sub-problem, not yet superseded in the knowledge base.