CL · Feb 28, 2023

A Survey on Long Text Modeling with Transformers

arXiv:2302.14502v2 · 73 citations · h-index: 70
Originality: Synthesis-oriented
AI Analysis

The paper synthesizes existing research for NLP researchers working on long-document processing. As a survey, its contribution is incremental rather than novel.

This paper surveys recent advances in modeling long texts using Transformers, addressing challenges like length limitations and complex semantics, and provides an overview of methods and applications without presenting new experimental results.

Modeling long texts has been an essential technique in the field of natural language processing (NLP). With the ever-growing number of long documents, it is important to develop effective modeling methods that can process and analyze such texts. However, long texts pose important research challenges for existing text models, with more complex semantics and special characteristics. In this paper, we provide an overview of the recent advances in long text modeling based on Transformer models. Firstly, we introduce the formal definition of long text modeling. Then, as the core content, we discuss how to process long input to satisfy the length limitation and design improved Transformer architectures to effectively extend the maximum context length. Following this, we discuss how to adapt Transformer models to capture the special characteristics of long texts. Finally, we describe four typical applications involving long text modeling and conclude this paper with a discussion of future directions. Our survey intends to provide researchers with a synthesis and pointer to related work on long text modeling.

