CL · QUANT-PH · Dec 21, 2024

Quantum-Like Contextuality in Large Language Models

arXiv:2412.16806v1 · 4 citations · h-index: 26 · Has Code
Originality: Incremental advance
AI Analysis

This paper provides the first large-scale evidence of quantum-like contextuality in language, suggesting potential advantages for quantum methods in NLP tasks such as co-reference resolution.

The paper investigates whether quantum-like contextuality occurs in natural language. It constructs a linguistic schema based on a contextual quantum scenario, instantiates it in the Simple English Wikipedia, and uses BERT to extract probability distributions, finding 77,118 sheaf-contextual and 36,938,948 CbD-contextual instances. It shows that the contextual instances come from semantically similar words by deriving an equation relating degrees of contextuality to Euclidean distances between BERT embedding vectors, and a regression model identifies Euclidean distance as the best statistical predictor of contextuality.

Contextuality is a distinguishing feature of quantum mechanics, and there is growing evidence that it is a necessary condition for quantum advantage. To make use of it, researchers have been asking whether similar phenomena arise in other domains. The answer has been yes, e.g. in the behavioural sciences. However, one has to move to frameworks that take some degree of signalling into account. Two such frameworks exist: (1) a signalling-corrected sheaf-theoretic model, and (2) the Contextuality-by-Default (CbD) framework. This paper provides the first large-scale experimental evidence that the answer is also yes for natural language. We construct a linguistic schema modelled over a contextual quantum scenario, instantiate it in the Simple English Wikipedia, and extract probability distributions for the instances using the large language model BERT. This led to the discovery of 77,118 sheaf-contextual and 36,938,948 CbD-contextual instances. We proved that the contextual instances came from semantically similar words by deriving an equation between degrees of contextuality and Euclidean distances of BERT's embedding vectors. A regression model further reveals that Euclidean distance is indeed the best statistical predictor of contextuality. Our linguistic schema is a variant of the co-reference resolution challenge. These results indicate that quantum methods may be advantageous in language tasks.
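
To make the notion of a CbD "degree of contextuality" concrete, here is a minimal sketch of the standard CbD criterion for cyclic systems of binary (±1) variables, as given in the CbD literature. The abstract does not state which scenario or formula the paper uses, so the choice of a cyclic system is an assumption, and the names `s_odd` and `cbd_degree` are hypothetical. This is not the paper's code. The signalling term is what lets the measure tolerate marginals that change with context.

```python
from itertools import product


def s_odd(values):
    """Maximum of sum(sign_i * x_i) over sign patterns with an odd number of minus signs."""
    best = float("-inf")
    for signs in product((1, -1), repeat=len(values)):
        if signs.count(-1) % 2 == 1:
            best = max(best, sum(s * x for s, x in zip(signs, values)))
    return best


def cbd_degree(pair_expectations, marginal_pairs):
    """
    Assumed setting: a cyclic system with n contexts and +1/-1 outcomes.

    pair_expectations: the n product expectations, one per context.
    marginal_pairs: n tuples holding the expectation of the same content
        in the two contexts where it appears.

    Returns Delta = s_odd(products) - (n - 2) - total signalling.
    The system counts as CbD-contextual when Delta > 0.
    """
    n = len(pair_expectations)
    signalling = sum(abs(a - b) for a, b in marginal_pairs)
    return s_odd(pair_expectations) - (n - 2) - signalling


# Illustrative inputs (not data from the paper).
pr_box = cbd_degree([1, 1, 1, -1], [(0, 0)] * 4)      # maximally contextual
classical = cbd_degree([1, 1, 1, 1], [(1, 1)] * 4)    # every variable fixed at +1
print(pr_box > 0, classical > 0)
```

With no signalling and n = 4, the condition Delta > 0 reduces to the familiar CHSH-type bound of 2 on the correlations; any difference between the marginals of the same content across contexts is subtracted, which raises the bar for contextuality.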

Code Implementations: 1 repo