CL · AI · IT · ML · Mar 23, 2023

A Simple Explanation for the Phase Transition in Large Language Models with List Decoding

arXiv:2303.13112v1 · 4 citations · h-index: 36
Originality: Incremental advance
AI Analysis

The paper provides a theoretical explanation for the phase transition phenomenon in LLMs. The advance is incremental because it builds on existing observations of emergent abilities.

The paper addresses the problem of explaining emergent abilities in large language models (LLMs). It models an LLM as a sequence-to-sequence random function with list decoding and shows that a critical threshold exists: below it, the expected number of erroneous candidate sequences remains bounded; above it, that number grows exponentially.

Various recent experimental results show that large language models (LLMs) exhibit emergent abilities that are not present in small models. System performance is greatly improved after passing a certain critical threshold of scale. In this letter, we provide a simple explanation for such a phase transition phenomenon. To do so, we model an LLM as a sequence-to-sequence random function. Instead of using instant generation at each step, we use a list decoder that keeps a list of candidate sequences at each step and defers the generation of the output sequence to the end. We show that there is a critical threshold such that the expected number of erroneous candidate sequences remains bounded when an LLM is below the threshold, and it grows exponentially when an LLM is above the threshold. This threshold is related to the basic reproduction number of a contagious disease.
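
The abstract does not state the recursion behind the threshold, so the following is only a toy illustration of the bounded-versus-exponential behaviour it describes, not the paper's model. It assumes a simple branching-process picture, in the spirit of the basic reproduction number: each erroneous candidate in the list produces on average `r` erroneous candidates at the next step, and the correct sequence contributes on average `c` new erroneous candidates per step. The names `r`, `c`, and `expected_erroneous` are hypothetical and chosen for this sketch.

```python
def expected_erroneous(r, c, steps):
    """Iterate E[n+1] = r * E[n] + c from E[0] = 0 and return the trajectory.

    r: assumed mean number of erroneous candidates spawned per erroneous
       candidate per decoding step (plays the role of a reproduction number).
    c: assumed mean number of erroneous candidates spawned per step by the
       correct sequence.
    """
    counts = [0.0]
    for _ in range(steps):
        counts.append(r * counts[-1] + c)
    return counts


# r < 1: the expected count approaches the fixed point c / (1 - r).
below = expected_erroneous(r=0.5, c=1.0, steps=50)
# r > 1: the expected count grows geometrically, roughly like r ** n.
above = expected_erroneous(r=1.5, c=1.0, steps=50)

print(f"r = 0.5, after 50 steps: {below[-1]:.3f}")
print(f"r = 1.5, after 50 steps: {above[-1]:.3e}")
```

Under these assumptions the critical value is `r = 1`: for `r < 1` the expected count never exceeds `c / (1 - r)`, while for `r > 1` it grows without bound as the number of steps increases.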

