AI, CL, LG | Aug 9, 2023

On the Unexpected Abilities of Large Language Models

arXiv:2308.09720v2 | 15 citations | h-index: 52
Originality: Synthesis-oriented
AI Analysis

This work addresses a fundamental problem for researchers and practitioners: understanding emergent abilities in AI. Its contribution is incremental, however, because it synthesizes existing research without presenting new empirical results.

The paper reviews how large language models acquire diverse cognitive abilities beyond their training objective of next-word prediction, exploring the indirect processes and factors enabling this development and questioning the predictability of their full capabilities.

Large Language Models (LLMs) are capable of displaying a wide range of abilities that are not directly connected with the task for which they are trained: predicting the next words of human-written texts. In this article, I review recent research investigating the cognitive abilities developed by LLMs and their relation to human cognition. I discuss the nature of the indirect process that leads to the acquisition of these cognitive abilities, their relation to other indirect processes, and the implications for the acquisition of integrated abilities. Moreover, I propose the factors that enable the development of abilities that are related only very indirectly to the proximal objective of the training task. Finally, I discuss whether the full set of capabilities that LLMs could possibly develop is predictable.

