Semantic Spaces
This work uses geometric models to connect computational representations of natural language with the way language resides in human minds and functions in human communication, with relevance to both computational and theoretical linguistics.
The paper bridges computational and human linguistic representations by studying the geometry underlying vector space models of semantics in natural language processing: Grassmannians, projective spaces, and flag varieties. The resulting framework interprets Latent Semantics as a geometric flow on Grassmannians and formulates Gärdenfors' notion of "meeting of minds" in geometric terms.
Any natural language can be considered as a tool for producing large databases (consisting of texts, written or discursive). Describing this tool in turn requires other large databases (dictionaries, grammars, etc.). Nowadays, the notion of a database is associated with computer processing and computer memory. However, a natural language also resides in human brains and functions in human communication, from interpersonal to intergenerational. In this survey/research paper we discuss mathematical, in particular geometric, constructions that help to bridge these two worlds. Specifically, we consider the Vector Space Model of semantics based on frequency matrices, as used in Natural Language Processing. We investigate the underlying geometries, formulated in terms of Grassmannians, projective spaces, and flag varieties. We formulate the relation between vector space models and semantic spaces based on semic axes in terms of projectability of subvarieties in Grassmannians and projective spaces. We interpret Latent Semantics as a geometric flow on Grassmannians. We also discuss how to formulate Gärdenfors' notion of "meeting of minds" in our geometric setting.
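To make the link between frequency matrices and Grassmannians concrete, the following minimal sketch may help. It is an illustration, not the paper's own construction: it assumes the common reading of Latent Semantics as a truncated singular value decomposition of a term-document frequency matrix, and it treats the span of the top k left singular vectors as a point of the Grassmannian of k-dimensional subspaces. The toy matrix and the function names `latent_subspace` and `projector` are hypothetical.

```python
import numpy as np


def latent_subspace(freq, k):
    """Orthonormal basis (as columns) of the top-k left singular subspace.

    Assumption: Latent Semantics is read here as truncated SVD of the
    frequency matrix; the paper may use a different formulation.
    """
    u, _, _ = np.linalg.svd(freq, full_matrices=False)
    return u[:, :k]


def projector(basis):
    """Orthogonal projector onto the span of an orthonormal basis.

    The projector depends only on the subspace, not on the chosen basis,
    so it is a convenient representative of a point in a Grassmannian.
    """
    return basis @ basis.T


# Toy term-document frequency matrix (rows: terms, columns: documents).
# The counts are invented for illustration only.
freq = np.array([
    [2.0, 1.0, 0.0, 0.0],
    [1.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 1.0],
    [0.0, 0.0, 1.0, 3.0],
    [1.0, 0.0, 0.0, 1.0],
])

k = 2
basis = latent_subspace(freq, k)
P = projector(basis)

# Changing the basis by a rotation inside the subspace leaves the
# projector unchanged: both bases describe the same Grassmannian point.
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta), np.cos(theta)]])
rotated = basis @ R

print(np.allclose(projector(rotated), P))
```

In this sketch the choice k = 2 is arbitrary; varying k yields a nested family of subspaces, which is one informal way to see why flag varieties appear alongside Grassmannians.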