CL · QUANT-PH · Feb 7, 2024

Developments in Sheaf-Theoretic Models of Natural Language Ambiguities

arXiv:2402.04505v2 · 2 citations · h-index: 26 · DCM
Originality: Incremental advance
AI Analysis

This work addresses anaphoric ambiguity in natural language processing and is an incremental extension of existing sheaf-theoretic models.

The paper models discourse ambiguities arising from anaphora by extending sheaf-theoretic models from the lexical level to the discourse level. On a dataset of basic anaphoric discourses, 82.9% of the models are contextual, compared with 3.17% in previous work. The paper also models an extension of the Winograd Schema challenge on the Bell-CHSH scenario, with a contextual fraction of 0.096.

Sheaves are mathematical objects consisting of a base, which constitutes a topological space, and the data associated with each open set thereof, e.g. continuous functions defined on the open sets. Sheaves were originally used in algebraic topology and logic. Recently, they have also been used to model events such as physical experiments and natural language disambiguation processes. We extend the latter models from lexical ambiguities to discourse ambiguities arising from anaphora. First, we calculate a new measure of contextuality for a dataset of basic anaphoric discourses, resulting in a higher proportion of contextual models (82.9%) than previous work, which yielded only 3.17% contextual models. Then, we show how an extension of the natural language processing challenge known as the Winograd Schema, which involves anaphoric ambiguities, can be modelled on the Bell-CHSH scenario with a contextual fraction of 0.096.
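
To make the term "contextual fraction" on the Bell-CHSH scenario concrete, the following is a minimal sketch and not the paper's implementation. It assumes a no-signalling empirical model with two parties, two measurement settings each, and binary outcomes; under that assumption the contextual fraction can be read off the largest of the eight CHSH expressions as max(0, (S - 2) / 2). The dictionary layout and the function names are hypothetical, and the example models are synthetic rather than taken from the paper's data. For signalling models or larger scenarios, a linear program over global assignments would be needed instead.

```python
from itertools import product

# Hypothetical layout: model[(x, y)][(a, b)] = P(a, b | x, y),
# with settings x, y in {0, 1} and outcomes a, b in {0, 1}.
CONTEXTS = list(product((0, 1), repeat=2))
OUTCOMES = list(product((0, 1), repeat=2))


def correlator(dist):
    """E = P(a == b) - P(a != b) for a single measurement context."""
    return sum(p if a == b else -p for (a, b), p in dist.items())


def chsh_values(model):
    """Return the 8 CHSH expressions: one correlator is negated, then an overall +/- sign."""
    E = {ctx: correlator(model[ctx]) for ctx in CONTEXTS}
    values = []
    for flipped in CONTEXTS:
        s = sum(-e if ctx == flipped else e for ctx, e in E.items())
        values.extend([s, -s])
    return values


def contextual_fraction_chsh(model):
    """Contextual fraction of a no-signalling model on the CHSH scenario (assumption)."""
    return max(0.0, (max(chsh_values(model)) - 2.0) / 2.0)


def mix(weight, model_a, model_b):
    """Convex combination weight * model_a + (1 - weight) * model_b."""
    return {
        ctx: {o: weight * model_a[ctx][o] + (1 - weight) * model_b[ctx][o] for o in OUTCOMES}
        for ctx in CONTEXTS
    }


# Synthetic examples: a PR box (a XOR b == x AND y) and a deterministic local model.
pr_box = {
    (x, y): {(a, b): 0.5 if (a ^ b) == (x & y) else 0.0 for (a, b) in OUTCOMES}
    for (x, y) in CONTEXTS
}
local = {ctx: {o: 1.0 if o == (0, 0) else 0.0 for o in OUTCOMES} for ctx in CONTEXTS}

print(contextual_fraction_chsh(pr_box))                    # 1.0 (strongly contextual)
print(contextual_fraction_chsh(local))                     # 0.0 (non-contextual)
print(contextual_fraction_chsh(mix(0.25, pr_box, local)))  # 0.25
```

Read this way, a contextual fraction of 0.096 means that at most a fraction 0.904 of the empirical model can be explained by a non-contextual model.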

