The Causal Structure of Semantic Ambiguities
This work addresses the challenge of modeling semantic disambiguation in psycholinguistics. Its contribution is incremental, since it builds on existing theories and applies them to new data.
The paper formalizes human disambiguation processes for semantic ambiguities by applying a sheaf-theoretic causal model to a dataset of ambiguous phrases. It finds that the prominent disambiguation orders run from subject to verb and from object to verb, and that the disambiguation of polysemous verbs is delayed compared to that of homonymous ones.
Ambiguity is a natural language phenomenon occurring at different levels of syntax, semantics, and pragmatics. It is widely studied; in psycholinguistics, for instance, there is a variety of competing studies of human disambiguation processes. These studies are empirical and based on eye-tracking measurements. Here we take first steps towards formalizing these processes for semantic ambiguities, in which we identify the presence of two features: (1) joint plausibility degrees of different possible interpretations, and (2) causal structures according to which certain words play a more substantial role in the processes. The novel sheaf-theoretic model of definite causality developed by Gogioso and Pinzani in QPL 2021 offers tools to model and reason about these features. We applied this theory to a dataset of ambiguous phrases extracted from the psycholinguistics literature, together with human plausibility judgements that we collected using Amazon Mechanical Turk. We measured the causal fractions of different disambiguation orders within the phrases and discovered two prominent orders: from subject to verb in subject-verb phrases and from object to verb in verb-object phrases. We also found evidence for a delay in the disambiguation of polysemous verbs compared to homonymous verbs, again compatible with psycholinguistic findings.