LG AI CL NE · Jul 19, 2022

Formal Algorithms for Transformers

arXiv:2207.09238v129.9105 citations · h-index: 45

Originality: Synthesis-oriented

AI Analysis

This is an incremental work that offers a formal reference for researchers and practitioners in machine learning to understand transformer fundamentals.

The paper provides a mathematically precise overview of transformer architectures and algorithms, detailing their components, training, and applications without presenting new experimental results.

This document aims to be a self-contained, mathematically precise overview of transformer architectures and algorithms (*not* results). It covers what transformers are, how they are trained, what they are used for, their key architectural components, and a preview of the most prominent models. The reader is assumed to be familiar with basic ML terminology and simpler neural network architectures such as MLPs.
