ML, CL, LG | Nov 17, 2015

Learning the Dimensionality of Word Embeddings

arXiv:1511.05392v3 | 4 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of choosing embedding dimensionality for natural language processing tasks and offers insight into how semantic information is distributed across embedding dimensions. It is an incremental advance because it builds directly on the existing word2vec models.

The authors tackled the problem of learning word embeddings with data-dependent dimensionality by introducing nonparametric analogs of word2vec models, showing they are competitive with fixed-dimension counterparts while providing a distribution over embedding dimensionalities.

We describe a method for learning word embeddings with data-dependent dimensionality. Our Stochastic Dimensionality Skip-Gram (SD-SG) and Stochastic Dimensionality Continuous Bag-of-Words (SD-CBOW) are nonparametric analogs of Mikolov et al.'s (2013) well-known 'word2vec' models. Vector dimensionality is made dynamic by employing techniques used by Cote & Larochelle (2016) to define an RBM with an infinite number of hidden units. We show qualitatively and quantitatively that SD-SG and SD-CBOW are competitive with their fixed-dimension counterparts while providing a distribution over embedding dimensionalities, which offers a window into how semantics distribute across dimensions.
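As a rough illustration of what "a distribution over embedding dimensionalities" could look like, the sketch below scores each candidate dimensionality z for a single word/context vector pair. The energy function, the `penalty` parameter, the truncation to a finite maximum dimension, and the function name `dimensionality_posterior` are all assumptions made for illustration; the abstract does not state them, and this should not be read as the authors' SD-SG or SD-CBOW formulation.

```python
import math

def dimensionality_posterior(w, c, penalty=0.1):
    """Return p(z | w, c) for z = 1..len(w) under an assumed energy.

    Assumed (hypothetical) energy of using only the first z dimensions:
        E(z) = sum_{i<=z} [ -w_i * c_i + penalty * (w_i**2 + c_i**2) ]
    The distribution is a softmax over -E(z), truncated at len(w).
    """
    energies = []
    running = 0.0
    for wi, ci in zip(w, c):
        # Each extra dimension adds its (negative) dot-product term
        # plus a quadratic cost for being switched on.
        running += -wi * ci + penalty * (wi * wi + ci * ci)
        energies.append(running)

    # Subtract the largest log-weight before exponentiating for stability.
    max_log_weight = max(-e for e in energies)
    weights = [math.exp(-e - max_log_weight) for e in energies]
    total = sum(weights)
    return [x / total for x in weights]

if __name__ == "__main__":
    word = [1.0, 1.0, 0.0, 0.0]
    context = [1.0, 1.0, 0.0, 0.0]
    for z, p in enumerate(dimensionality_posterior(word, context), start=1):
        print(f"z={z}: p={p:.3f}")
```

In this toy setting, dimensions where the two vectors agree lower the energy and shift probability mass toward larger z, while all-zero trailing dimensions leave the energy unchanged, so they tie with the last informative dimension.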

