Cognitive and Cultural Topology of Linguistic Categories: A Semantic-Pragmatic Metric Approach
This work addresses a problem faced by linguists and NLP researchers, the integration of semantic and pragmatic dimensions, and offers an incremental improvement by drawing on insights from related disciplines.
The study tackled the challenge of modeling semantic and pragmatic interactions in NLP by introducing a geometric metric based on word co-occurrence patterns. The metric mapped cognitive and socio-cultural properties in hyperbolic space, and the resulting mappings surpassed traditional benchmarks and demonstrated significant socio-cultural relevance.
In recent years, the field of NLP has seen growing interest in modeling both semantic and pragmatic dimensions. Despite this progress, two key challenges persist: first, the complex task of mapping and analyzing the interactions between semantic and pragmatic features; second, the insufficient incorporation of relevant insights from disciplines outside NLP. To address these issues, this study introduces a novel geometric metric based on word co-occurrence patterns. The metric maps two fundamental properties of basic-level categories, semantic typicality (cognitive) and pragmatic salience (socio-cultural), within a two-dimensional hyperbolic space. Our evaluations show that this semantic-pragmatic metric produces mappings for basic-level categories that not only surpass traditional cognitive semantics benchmarks but also demonstrate significant socio-cultural relevance. This finding suggests that basic-level categories, traditionally viewed as semantics-driven cognitive constructs, should be examined along both semantic and pragmatic dimensions, highlighting their role as a cognitive-cultural interface. The broad contribution of this paper lies in the development of medium-sized, interpretable, and human-centric language embedding models that blend semantic and pragmatic dimensions to elucidate both the cognitive and socio-cultural significance of linguistic categories.
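To make the geometric setting concrete, the following is a minimal sketch of how distances behave in a two-dimensional hyperbolic space. It assumes the Poincaré disk model, which the abstract does not specify, and the function name `poincare_distance` is hypothetical. It illustrates only the standard geodesic distance of that model, not the paper's semantic-pragmatic metric or its use of co-occurrence statistics.

```python
import math


def poincare_distance(u, v):
    """Geodesic distance between two points of the open unit disk.

    Assumption: the 2D hyperbolic space is represented by the Poincare
    disk model; the source does not state which model it uses.
    """
    norm_u_sq = u[0] ** 2 + u[1] ** 2
    norm_v_sq = v[0] ** 2 + v[1] ** 2
    if norm_u_sq >= 1.0 or norm_v_sq >= 1.0:
        raise ValueError("points must lie strictly inside the unit disk")
    diff_sq = (u[0] - v[0]) ** 2 + (u[1] - v[1]) ** 2
    # d(u, v) = arcosh(1 + 2 * |u - v|^2 / ((1 - |u|^2) * (1 - |v|^2)))
    return math.acosh(1.0 + 2.0 * diff_sq / ((1.0 - norm_u_sq) * (1.0 - norm_v_sq)))


if __name__ == "__main__":
    origin = (0.0, 0.0)
    # Equal Euclidean steps cost more hyperbolic distance near the boundary.
    print(poincare_distance(origin, (0.5, 0.0)))
    print(poincare_distance((0.4, 0.0), (0.9, 0.0)))
```

In this model, points near the boundary of the disk are far from one another even when their Euclidean separation is small. This is one reason hyperbolic spaces are often used to embed hierarchically organized items in few dimensions.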