High-Dimensional Vector Semantics
This work addresses the set membership problem in high-dimensional spaces, with applications to word embeddings, document similarity, and spam filtering. It appears incremental, however, because it builds on known properties of random vectors.
The paper tackles the vector semantics problem by leveraging the 'almost orthogonal' property of high-dimensional random vectors to memorize them through addition, resulting in an efficient probabilistic solution for set membership.
In this paper we explore the "vector semantics" problem from the perspective of the "almost orthogonal" property of high-dimensional random vectors. We show that this intriguing property can be used to "memorize" random vectors by simply adding them, and we provide an efficient probabilistic solution to the set membership problem. We also discuss several applications to word context vector embeddings, document sentence similarity, and spam filtering.
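The idea of memorizing vectors by addition can be illustrated with a minimal Python sketch. It is not the paper's implementation: the choice of random +1/-1 entries, the dimension of 4000, the 0.5 threshold, and the helper names `random_vector`, `memorize`, and `is_member` are all assumptions made for illustration. Two independent random vectors of this kind have a cosine close to zero, so the dot product of the summed "memory" vector with a stored vector is dominated by that vector's own squared norm, while for an unseen vector only small cross terms remain.

```python
import random


def random_vector(dim, rng):
    """Draw a random vector with independent +1/-1 entries (an assumed distribution)."""
    return [rng.choice((-1, 1)) for _ in range(dim)]


def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def memorize(vectors):
    """Superpose the vectors by element-wise addition into one memory vector."""
    return [sum(column) for column in zip(*vectors)]


def is_member(memory, query, threshold=0.5):
    """Probabilistic membership test.

    For +1/-1 vectors the squared norm equals the dimension, so the
    normalized score is near 1 for a stored vector and near 0 otherwise.
    The threshold of 0.5 is an illustrative choice, not taken from the paper.
    """
    return dot(memory, query) / len(query) > threshold


rng = random.Random(0)   # fixed seed so the run is repeatable
DIM = 4000               # illustrative dimension

stored = [random_vector(DIM, rng) for _ in range(10)]
unseen = [random_vector(DIM, rng) for _ in range(10)]
memory = memorize(stored)

# Cosine between two independent random vectors: close to 0 in high dimension.
cosine = dot(stored[0], stored[1]) / DIM

stored_hits = [is_member(memory, v) for v in stored]
unseen_hits = [is_member(memory, v) for v in unseen]

print("cosine of two random vectors:", cosine)
print("stored vectors recognized:", sum(stored_hits), "of", len(stored))
print("unseen vectors accepted:", sum(unseen_hits), "of", len(unseen))
```

The test is probabilistic because the cross terms act as noise whose spread grows with the number of stored vectors and shrinks with the dimension. Under these assumptions, a single memory vector can hold more items reliably only if the dimension is increased accordingly.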