CL · Apr 25, 2018

Factors Influencing the Surprising Instability of Word Embeddings

arXiv:1804.09692v1 · 130 citations
Originality: Incremental advance
AI Analysis

The paper addresses a reliability problem in widely used word embedding methods, which matters to natural language processing researchers and practitioners who build on those embeddings.

The paper investigates the instability of word embeddings, showing that even high-frequency words (100-200 occurrences) can be unstable, and analyzes how various factors affect this stability and its impact on downstream tasks.

Despite the recent popularity of word embedding methods, there is only a small body of work exploring the limitations of these representations. In this paper, we consider one aspect of embedding spaces, namely their stability. We show that even relatively high frequency words (100-200 occurrences) are often unstable. We provide empirical evidence for how various factors contribute to the stability of word embeddings, and we analyze the effects of stability on downstream tasks.
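The abstract does not say how stability is measured. One common way to quantify it, assumed here for illustration rather than taken from the text above, is the overlap between a word's k nearest neighbors in two embedding spaces trained on the same data. The sketch below uses hypothetical names (`stability`, `nearest_neighbors`) and made-up two-dimensional toy vectors; it is not the authors' implementation.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def nearest_neighbors(word, space, k):
    """Return the set of the k words closest to `word` by cosine similarity."""
    target = space[word]
    others = [w for w in space if w != word]
    others.sort(key=lambda w: cosine(target, space[w]), reverse=True)
    return set(others[:k])


def stability(word, space_a, space_b, k=10):
    """Fraction of the k nearest neighbors of `word` shared by both spaces.

    Assumed definition for illustration: 1.0 means identical neighbor sets,
    0.0 means no neighbors in common.
    """
    # Compare only over the vocabulary that both spaces contain.
    shared = set(space_a) & set(space_b)
    if word not in shared:
        return 0.0
    k = min(k, len(shared) - 1)
    if k <= 0:
        return 0.0
    a = {w: space_a[w] for w in shared}
    b = {w: space_b[w] for w in shared}
    overlap = nearest_neighbors(word, a, k) & nearest_neighbors(word, b, k)
    return len(overlap) / k


# Toy vectors invented for this example (not real embeddings).
space_a = {"cat": [1.0, 0.0], "dog": [0.9, 0.1], "car": [0.0, 1.0], "bus": [0.1, 0.9]}
space_b = {"cat": [1.0, 0.0], "dog": [0.1, 0.9], "car": [0.9, 0.1], "bus": [0.0, 1.0]}

print(stability("cat", space_a, space_b, k=1))
print(stability("cat", space_a, space_a, k=2))
```

In the toy data, the nearest neighbor of "cat" is "dog" in the first space and "car" in the second, so a word can look well placed in each space individually while still being unstable across them.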

