CL, Mar 5, 2017

Random vector generation of a semantic space

arXiv:1703.02031v1
Originality: Synthesis-oriented
AI Analysis

The paper provides a practical method for natural language processing tasks such as synonym generation and word selection for text generation. It appears incremental, however, since it builds on existing vector space and random projection techniques.

The authors construct a semantic space from a French synonym dictionary using random vectors and random projection. The resulting space can separate homonyms and generate realistic synonym lists. The process is extremely fast, can be updated in real time, and applies to any language.

We show how random vectors and random projection can be implemented in the usual vector space model to construct a Euclidean semantic space from a French synonym dictionary. We evaluate theoretically the resulting noise and show the experimental distribution of the similarities of terms in a neighborhood according to the choice of parameters. We also show that the Schmidt orthogonalization process is applicable and can be used to separate homonyms with distinct semantic meanings. Neighboring terms are easily arranged into semantically significant clusters which are well suited to the generation of realistic lists of synonyms and to such applications as word selection for automatic text generation. This process, applicable to any language, can easily be extended to collocations, is extremely fast and can be updated in real time, whenever new synonyms are proposed.
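
To make the construction in the abstract concrete, here is a minimal sketch in Python. It is not the authors' implementation. It assumes that each term receives a random Gaussian index vector, that a term's semantic vector is the normalized sum of its own index vector and those of its listed synonyms (equivalent to a random projection of the term-by-synonym adjacency matrix), and that a homonym's senses can be pulled apart with a single Gram-Schmidt step. The toy dictionary, the dimension of 2048, the seed, and the function names `build_space` and `remove_direction` are all hypothetical choices made for illustration.

```python
import math
import random


def dot(a, b):
    """Dot product; equals cosine similarity when both vectors have unit length."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]


def build_space(synonyms, dim=2048, seed=0):
    """Map each term to a unit vector built from random index vectors.

    Assumption: a term's vector is its own index vector plus the index
    vectors of its listed synonyms, with equal weights.
    """
    rng = random.Random(seed)
    terms = sorted(set(synonyms) | {s for syns in synonyms.values() for s in syns})
    index = {t: normalize([rng.gauss(0.0, 1.0) for _ in range(dim)]) for t in terms}
    space = {}
    for term in terms:
        members = [term] + list(synonyms.get(term, []))
        summed = [sum(index[m][i] for m in members) for i in range(dim)]
        space[term] = normalize(summed)
    return space


def remove_direction(v, direction):
    """One Gram-Schmidt step: subtract the component of v along direction."""
    d = normalize(direction)
    coeff = dot(v, d)
    return normalize([x - coeff * y for x, y in zip(v, d)])


# Hypothetical toy dictionary: "avocat" is both a lawyer and a fruit (avocado).
toy = {
    "avocat": ["juriste", "plaideur", "fruit"],
    "juriste": ["avocat", "plaideur", "magistrat"],
    "plaideur": ["avocat", "juriste", "orateur"],
    "fruit": ["avocat", "poire"],
    "poire": ["fruit"],
}

space = build_space(toy)

# Before orthogonalization, the legal sense dominates the neighbourhood.
legal_before = dot(space["avocat"], space["plaideur"])
fruit_before = dot(space["avocat"], space["fruit"])

# Project out the "plaideur" direction to expose the remaining sense.
residual = remove_direction(space["avocat"], space["plaideur"])
neighbours = sorted(
    ["plaideur", "juriste", "fruit"],
    key=lambda t: dot(residual, space[t]),
    reverse=True,
)
print(neighbours)
```

Because the index vectors are random rather than exactly orthogonal, unrelated terms have a small nonzero similarity. This is the noise the abstract refers to, and in this sketch it shrinks as `dim` grows. Adding a synonym only requires adding one index vector to the affected sums, which is consistent with the real-time update property the abstract claims.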

