CL · Sep 26, 2022

Entailment Semantics Can Be Extracted from an Ideal Language Model

arXiv:2209.12407v3 · 293 citations · h-index: 40
Originality: Incremental advance
AI Analysis

The paper offers a theoretical pathway for understanding the semantic information in unlabeled data, which may be useful to linguists and AI researchers. It is rated incremental because it builds on existing linguistic theories.

The paper asks whether natural language semantics can be inferred from text alone. It proves that entailment judgments between sentences can be extracted from an ideal language model trained on data generated by Gricean agents, and shows that these judgments can be decoded from such a model's predictions.

Language models are often trained on text alone, without additional grounding. There is debate as to how much of natural language semantics can be inferred from such a procedure. We prove that entailment judgments between sentences can be extracted from an ideal language model that has perfectly learned its target distribution, assuming the training sentences are generated by Gricean agents, i.e., agents who follow fundamental principles of communication from the linguistic theory of pragmatics. We also show entailment judgments can be decoded from the predictions of a language model trained on such Gricean data. Our results reveal a pathway for understanding the semantic information encoded in unlabeled linguistic data and a potential framework for extracting semantics from language models.

Code Implementations: 1 repo
