CL · Oct 24, 2020

Word2vec Conjecture and A Limitative Result

arXiv:2010.12719v1 · 1 citation
Originality: Incremental advance
AI Analysis

This paper addresses a foundational representational question in natural language processing by showing a limitation of vector-based semantic models. It is an incremental advance that builds on prior work on word2vec.

The paper asks whether all semantic word-word relations can be represented by vector differences, a claim it calls the word2vec conjecture. It finds a class of relations that cannot be represented this way, which establishes a limitative result for vector spaces over fields such as the real numbers.

Inspired by the success of \texttt{word2vec} \citep{mikolov2013distributed} in capturing analogies, we study the conjecture that analogical relations can be represented by vector spaces. Unlike many previous works that focus on the distributional semantic aspect of \texttt{word2vec}, we study the purely \emph{representational} question: can \emph{all} semantic word-word relations be represented by differences (or directions) of vectors? We call this the word2vec conjecture and point out some of its desirable implications. However, we exhibit a class of relations that cannot be represented in this way, thus falsifying the conjecture and establishing a limitative result for the representability of semantic relations by vector spaces over fields of characteristic 0, e.g., the real or complex numbers.
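
To make "represented by differences of vectors" concrete, here is a minimal Python sketch. It is an illustration under stated assumptions, not the paper's construction: the toy embeddings, the helper names `offset` and `shared_offset`, and the choice of antonymy as the example relation are all hypothetical. It checks whether every pair in a relation shares one difference vector. A symmetric relation between distinct words shows the kind of obstruction that can arise: if both (a, b) and (b, a) are in the relation, the shared offset r must satisfy r = -r, so 2r = 0, and over a field of characteristic 0 this forces r = 0, meaning a and b would need the same vector.

```python
from typing import Dict, List, Optional, Tuple

Vector = Tuple[float, ...]


def offset(u: Vector, v: Vector) -> Vector:
    """Return the componentwise difference v - u."""
    return tuple(b - a for a, b in zip(u, v))


def shared_offset(
    emb: Dict[str, Vector], pairs: List[Tuple[str, str]]
) -> Optional[Vector]:
    """Return the common offset emb[b] - emb[a] if all pairs share it, else None."""
    offsets = {offset(emb[a], emb[b]) for a, b in pairs}
    return offsets.pop() if len(offsets) == 1 else None


# Hypothetical toy embeddings; integer coordinates keep comparisons exact.
emb: Dict[str, Vector] = {
    "man": (0, 0), "woman": (0, 1),
    "king": (3, 0), "queen": (3, 1),
    "hot": (5, 2), "cold": (7, 2),
}

# An asymmetric relation can share a single offset across its pairs.
gender = [("man", "woman"), ("king", "queen")]
gender_offset = shared_offset(emb, gender)

# A symmetric relation contains both (a, b) and (b, a); the two offsets are
# negatives of each other, so they agree only if both are the zero vector.
antonym = [("hot", "cold"), ("cold", "hot")]
antonym_offset = shared_offset(emb, antonym)

print(gender_offset, antonym_offset)
```

In this toy setup the gender pairs share one offset, while the antonym pairs do not, regardless of which distinct vectors are assigned to "hot" and "cold". Whether the class of relations exhibited in the paper is of this symmetric kind is not stated in the abstract.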

