CY CL · Dec 5, 2024

How Large Language Models (LLMs) Extrapolate: From Guided Missiles to Guided Prompts

arXiv:2501.10361v1

Originality: Synthesis-oriented

AI Analysis

It reframes the problem of LLM hallucinations for AI researchers by connecting them to historical extrapolation concepts, offering a philosophical rather than technical perspective.

This paper argues that large language models (LLMs) should be viewed as extrapolation machines, linking their successes and hallucinations to this statistical function, and traces its historical roots from early cybernetics to modern AI debates.

This paper argues that we should perceive LLMs as machines of extrapolation. Extrapolation is a statistical function for predicting the next value in a series. It contributes both to GPT's successes and to the controversies surrounding its hallucinations. The term hallucination implies a malfunction, yet this paper contends that it in fact indicates the chatbot's efficiency in extrapolation, albeit an excess of it. This article also has a historical dimension: it traces extrapolation to the nascent years of cybernetics. In 1941, when Norbert Wiener transitioned from missile science to communication engineering, the pivotal concept he adopted was none other than extrapolation. The Soviet mathematician Andrey Kolmogorov, renowned for his compression logic that inspired OpenAI, had developed in 1939 another extrapolation project that Wiener later found rather like his own. This paper uncovers the connections between hot war science, Cold War cybernetics, and the contemporary debates on LLM performance.
