CL AI · Sep 12, 2023

Stochastic LLMs do not Understand Language: Towards Symbolic, Explainable and Ontologically Based LLMs

arXiv:2309.05918v3 · 33 citations · h-index: 7
Originality: Synthesis-oriented
AI Analysis

This paper addresses the problem of unreliable and opaque language models for users who need factual accuracy and interpretability in AI systems. Its contribution is incremental, since it builds on existing symbolic methods.

The paper argues that current stochastic large language models (LLMs) are flawed because they cannot reliably provide factual information, lack explainability due to their subsymbolic nature, and fail in specific linguistic contexts. It proposes developing symbolic, explainable, and ontologically grounded language models as an alternative approach.

In our opinion the exuberance surrounding the relative success of data-driven large language models (LLMs) is slightly misguided, and for several reasons: (i) LLMs cannot be relied upon for factual information since for LLMs all ingested text (factual or non-factual) was created equal; (ii) due to their subsymbolic nature, whatever 'knowledge' these models acquire about language will always be buried in billions of microfeatures (weights), none of which is meaningful on its own; and (iii) LLMs will often fail to make the correct inferences in several linguistic contexts (e.g., nominal compounds, copredication, quantifier scope ambiguities, intensional contexts). Since we believe the relative success of data-driven large language models (LLMs) is not a reflection on the symbolic vs. subsymbolic debate but a reflection on applying the successful strategy of a bottom-up reverse engineering of language at scale, we suggest in this paper applying the effective bottom-up strategy in a symbolic setting, resulting in symbolic, explainable, and ontologically grounded language models.
