CEPE
Long-context / context-window extension
superseded — cited as a baseline and beaten by newer methods
2 papers critique it · 2 beat it on benchmarks
What papers say
Verbatim critique sentences, each from a paper that cites CEPE as a baseline.
“CEPE can process past context chunks in parallel, but these chunks must be passed through all its encoder layers (24-layer RoBERTa in CEPE) and layer-wise linear projections to obtain the final hidden states for cross-attention, leading to even slower inference speed than non-parallel Activation Beacon.”
— Stacked from One: Multi-Scale Self-Injection for Context Window Extension
“However, this heterogeneous architecture necessitates meticulous task design for the extra pretraining and warmup stages to stabilize the fine-tuning process.”
— Two are better than one: Context window extension with multi-grained self-injection
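The first critique describes a cost structure that a toy example can make concrete: context chunks are encoded in parallel, but every chunk still passes through every encoder layer, and the result is then projected once per decoder layer before cross-attention. The NumPy sketch below is only an illustration of that data flow, not CEPE's implementation. The layer counts, dimensions, the `tanh` stand-in for an encoder layer, and the helper names `toy_encoder` and `cross_attend` are all assumptions made for the example.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def toy_encoder(chunks, enc_weights):
    """Pass all chunks through every encoder layer.

    Chunks are batched along axis 0, so they are processed in parallel,
    but the work still scales with the number of encoder layers.
    """
    h = chunks  # shape: (n_chunks, chunk_len, d_enc)
    for w in enc_weights:
        h = np.tanh(h @ w)  # hypothetical stand-in for a transformer encoder layer
    return h

def cross_attend(queries, memory):
    # Single-head attention with keys = values = memory (a simplification).
    scores = queries @ memory.T / np.sqrt(queries.shape[-1])
    return softmax(scores) @ memory

rng = np.random.default_rng(0)
n_chunks, chunk_len, d_enc, d_dec = 4, 8, 16, 24   # assumed toy sizes
n_enc_layers, n_dec_layers = 3, 2                  # assumed; the quote cites a 24-layer encoder

chunks = rng.normal(size=(n_chunks, chunk_len, d_enc))
enc_weights = [rng.normal(size=(d_enc, d_enc)) / np.sqrt(d_enc)
               for _ in range(n_enc_layers)]
# One linear projection per decoder layer ("layer-wise linear projections").
projections = [rng.normal(size=(d_enc, d_dec)) / np.sqrt(d_enc)
               for _ in range(n_dec_layers)]

encoded = toy_encoder(chunks, enc_weights)            # (n_chunks, chunk_len, d_enc)
flat = encoded.reshape(n_chunks * chunk_len, d_enc)   # concatenate chunks into one memory

queries = rng.normal(size=(5, d_dec))                 # decoder-side hidden states
outputs = []
for w in projections:
    memory = flat @ w                                 # project encoder states for this layer
    outputs.append(cross_attend(queries, memory))
```

Even in this toy form, the encoder loop and the per-layer projections run before any decoder layer can attend to the context, which is the overhead the quoted sentence points to.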
Beaten on benchmarks
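Head-to-head results where a newer method reports beating CEPE. In each pair the first value is the newer method's and the second is CEPE's; for perplexity, lower is better. Values are copied from the source paper's tables; verify against the cited paper.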
Source for all results below: Stacked from One: Multi-Scale Self-Injection for Context Window Extension
- SharedLLM beats CEPE · Perplexity [Arxiv 4K] · 2.99 vs 3.03
- SharedLLM beats CEPE · Perplexity [Arxiv 8K] · 2.97 vs 3.02
- SharedLLM beats CEPE · Perplexity [Arxiv 32K] · 2.46 vs 2.51
- SharedLLM beats CEPE · Perplexity [Arxiv 128K] · 2.91 vs 2.97
- SharedLLM beats CEPE · Perplexity [PG19 4K] · 6.55 vs 6.69
- SharedLLM beats CEPE · Perplexity [PG19 8K] · 6.28 vs 6.40
- SharedLLM beats CEPE · Perplexity [PG19 32K] · 6.65 vs 6.80
- SharedLLM beats CEPE · Perplexity [PG19 128K] · 5.96 vs 6.10
- SharedLLM beats CEPE · Perplexity [ProofPile 4K] · 2.33 vs 2.38
- SharedLLM beats CEPE · Perplexity [ProofPile 8K] · 2.34 vs 2.43
- SharedLLM beats CEPE · Perplexity [ProofPile 32K] · 2.38 vs 2.45
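To put the perplexity pairs above on a common scale, a small script can express each gap as a percentage of CEPE's value. This is a minimal sketch: the numbers are the ones listed on this page (not re-checked against the paper), and the `results` dictionary and `relative_gap` helper are names introduced here for illustration.

```python
# (dataset, context length) -> (SharedLLM perplexity, CEPE perplexity),
# copied from the list above; verify against the cited paper.
results = {
    ("Arxiv", "4K"): (2.99, 3.03),
    ("Arxiv", "8K"): (2.97, 3.02),
    ("Arxiv", "32K"): (2.46, 2.51),
    ("Arxiv", "128K"): (2.91, 2.97),
    ("PG19", "4K"): (6.55, 6.69),
    ("PG19", "8K"): (6.28, 6.40),
    ("PG19", "32K"): (6.65, 6.80),
    ("PG19", "128K"): (5.96, 6.10),
    ("ProofPile", "4K"): (2.33, 2.38),
    ("ProofPile", "8K"): (2.34, 2.43),
    ("ProofPile", "32K"): (2.38, 2.45),
}

def relative_gap(newer, baseline):
    """Percentage by which `newer` perplexity is below `baseline`."""
    return (baseline - newer) / baseline * 100

gaps = {key: relative_gap(shared, cepe) for key, (shared, cepe) in results.items()}

for (dataset, ctx), gap in gaps.items():
    print(f"{dataset} {ctx}: SharedLLM perplexity is {gap:.2f}% lower than CEPE")

largest = max(gaps, key=gaps.get)
print("Largest relative gap:", largest)
```

On these figures every gap falls between roughly 1% and 4% of CEPE's perplexity, so the reported wins are consistent across datasets and context lengths but modest in size.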
Newer alternatives
Recent methods in the same sub-problem, not yet superseded in the knowledge base.