CL, AI
Sep 24, 2020

Toward a Thermodynamics of Meaning

arXiv:2009.11963v1

1 citation

Originality: Incremental advance

AI Analysis

This paper addresses a foundational question in AI, namely what language models are capable of learning, and offers a theoretical framework that could influence model design and interpretation.

The paper asks whether text-based language models can learn semantic information about the world. It argues that even simple models learn structural facts, proposes limits on the nature and extent of that learning, and aims to explain why cooccurrence prediction succeeds as a meaning-making strategy in AI.

As language models such as GPT-3 become increasingly successful at generating realistic text, questions about what purely text-based modeling can learn about the world have become more urgent. Is text purely syntactic, as skeptics argue? Or does it in fact contain some semantic information that a sufficiently sophisticated language model could use to learn about the world without any additional inputs? This paper describes a new model that suggests some qualified answers to those questions. By theorizing the relationship between text and the world it describes as an equilibrium relationship between a thermodynamic system and a much larger reservoir, this paper argues that even very simple language models do learn structural facts about the world, while also proposing relatively precise limits on the nature and extent of those facts. This perspective promises not only to answer questions about what language models actually learn, but also to explain the consistent and surprising success of cooccurrence prediction as a meaning-making strategy in AI.
