Mar 17, 2025

A Survey on Transformer Context Extension: Approaches and Evaluation

arXiv:2503.13299v2 | 15 citations | h-index: 9
Originality: Synthesis-oriented
AI Analysis

This is an incremental survey paper that organizes and synthesizes existing research on long context processing for LLMs, primarily benefiting researchers in natural language processing.

This survey paper tackles the problem of Transformer-based large language models (LLMs) degrading in performance when handling long contexts, by systematically reviewing existing approaches for context extension and organizing evaluation benchmarks. The result is a comprehensive taxonomy categorizing methods into four types and highlighting unresolved issues in the field.

Large language models (LLMs) based on the Transformer have been widely applied in the field of natural language processing (NLP), demonstrating strong performance, particularly on short-text tasks. However, in long context scenarios, the performance of LLMs degrades due to several challenges. A number of approaches have recently been proposed to alleviate this. In this survey, we first list the challenges of applying pre-trained LLMs to long contexts. We then systematically review the approaches related to long context and propose a taxonomy that categorizes them into four main types: positional encoding, context compression, retrieval augmented, and attention pattern. In addition to the approaches, we focus on the evaluation of long context, organizing relevant data, tasks, and metrics based on existing long context benchmarks. Finally, we summarize unresolved issues in the long context domain and put forward our views on future developments.

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
