LG · ML · Oct 15, 2020

Probabilistic Transformers

arXiv:2010.15583v3 · 2 citations
Originality: Synthesis-oriented
AI Analysis

The paper offers a theoretical insight for machine learning researchers, but it is incremental: it reinterprets an existing method and reports no new empirical results.

The paper shows that Transformers can be interpreted as maximum a posteriori (MAP) estimators for Gaussian mixture models. This gives Transformers a probabilistic interpretation and suggests extensions to other probabilistic settings.

We show that Transformers are Maximum Posterior Probability estimators for Mixtures of Gaussian Models. This brings a probabilistic point of view to Transformers and suggests extensions to other probabilistic cases.
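To make the connection concrete, here is a minimal sketch of one simple special case; it is not the paper's derivation. It assumes each key is the mean of an isotropic Gaussian component with shared variance `sigma2`, equal mixing weights, and keys of equal norm. Under those assumptions, the posterior responsibility of each component for a query equals the softmax attention weight with scale `sigma2`, so the posterior-weighted average of the values matches the attention output. All function and variable names are illustrative.

```python
import math
import random


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def weighted_sum(weights, values):
    """Weighted average of value vectors (weights sum to 1)."""
    dim = len(values[0])
    return [sum(w * v[c] for w, v in zip(weights, values)) for c in range(dim)]


def attention(q, keys, values, scale):
    """Dot-product attention for a single query."""
    weights = softmax([dot(q, k) / scale for k in keys])
    return weighted_sum(weights, values)


def gmm_readout(q, keys, values, sigma2):
    """Posterior-weighted value average under an assumed Gaussian mixture.

    Assumption: component j has mean keys[j], covariance sigma2 * I,
    and all components have the same prior probability.
    """
    log_lik = [
        -sum((a - b) ** 2 for a, b in zip(q, k)) / (2 * sigma2) for k in keys
    ]
    responsibilities = softmax(log_lik)  # posterior over components
    return weighted_sum(responsibilities, values)


def unit(v):
    """Rescale a vector to norm 1 so that all keys share the same norm."""
    n = math.sqrt(dot(v, v))
    return [x / n for x in v]


random.seed(0)
d = 8
q = [random.gauss(0, 1) for _ in range(d)]
keys = [unit([random.gauss(0, 1) for _ in range(d)]) for _ in range(6)]
values = [[random.gauss(0, 1) for _ in range(3)] for _ in range(6)]

sigma2 = math.sqrt(d)  # chosen to match the usual 1/sqrt(d) attention scaling
out_attn = attention(q, keys, values, scale=sigma2)
out_gmm = gmm_readout(q, keys, values, sigma2)
max_gap = max(abs(a - b) for a, b in zip(out_attn, out_gmm))
print(max_gap < 1e-9)
```

The two outputs agree because expanding the squared distance leaves a term in the query norm and a term in the key norm. The first is the same for every key, the second is constant under the equal-norm assumption, and softmax ignores constant shifts. If the keys have different norms, the two functions generally differ.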

Code Implementations: 1 repo
