LG · AI · CL · Jun 8, 2021

A Survey of Transformers

arXiv:2106.04554v2 · 1494 citations
AI Analysis

The survey addresses the need for a structured overview of Transformer developments among researchers and practitioners in AI fields such as NLP and computer vision. Its contribution is incremental: it synthesizes existing work and reports no new experimental results.

This survey fills the gap left by the absence of a systematic review of Transformer variants (X-formers). It provides a comprehensive literature review, categorizes the variants by architectural modification, pre-training, and applications, and outlines directions for future research.

Transformers have achieved great success in many artificial intelligence fields, such as natural language processing, computer vision, and audio processing, and have therefore attracted considerable interest from academic and industry researchers. A great variety of Transformer variants (a.k.a. X-formers) have been proposed to date; however, a systematic and comprehensive literature review of these variants is still missing. In this survey, we provide a comprehensive review of various X-formers. We first briefly introduce the vanilla Transformer and then propose a new taxonomy of X-formers. Next, we introduce the various X-formers from three perspectives: architectural modification, pre-training, and applications. Finally, we outline some potential directions for future research.

Code Implementations · 2 repos

Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it rather than by global fame.
