CL, AI, LG, ML | Jun 7, 2018

Probabilistic FastText for Multi-Sense Word Embeddings

Ben Athiwaratkun, Andrew Gordon Wilson, Anima Anandkumar

arXiv:1806.02901v1 | 32.51 | 136 citations | Has Code

Originality: Incremental advance

AI Analysis

This work addresses the need for more accurate and robust word representations in natural language processing, particularly for rare words and words with multiple meanings. It is an incremental advance that builds on existing FastText and probabilistic embedding methods.

The authors address the problem of learning word embeddings that capture multiple senses and handle rare or unseen words. They introduce Probabilistic FastText, which represents each word as a Gaussian mixture density whose component means are built from n-grams, and report that it outperforms FastText and dictionary-level probabilistic embeddings on word-similarity benchmarks, including English RareWord and foreign language datasets.

We introduce Probabilistic FastText, a new model for word embeddings that can capture multiple word senses, sub-word structure, and uncertainty information. In particular, we represent each word with a Gaussian mixture density, where the mean of a mixture component is given by the sum of n-grams. This representation allows the model to share statistical strength across sub-word structures (e.g. Latin roots), producing accurate representations of rare, misspelt, or even unseen words. Moreover, each component of the mixture can capture a different word sense. Probabilistic FastText outperforms both FastText, which has no probabilistic model, and dictionary-level probabilistic embeddings, which do not incorporate subword structures, on several word-similarity benchmarks, including English RareWord and foreign language datasets. We also achieve state-of-the-art performance on benchmarks that measure the ability to discern different meanings. Thus, the proposed model is the first to achieve multi-sense representations while having enriched semantics on rare words.
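The abstract's central idea, that a mixture component's mean is the sum of the word's n-gram vectors, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the authors' code: the names `char_ngrams`, `NgramTable`, and `component_mean` are hypothetical, the n-gram lengths (3 to 6) and the `<`/`>` boundary markers follow the common FastText convention, the vectors are random rather than trained, and `zlib.crc32` is a stand-in hash. Variances, mixture weights, and training are left out entirely.

```python
import random
import zlib


def char_ngrams(word, n_min=3, n_max=6):
    """Character n-grams of the word wrapped in boundary markers (assumed convention)."""
    token = f"<{word}>"
    return [token[i:i + n]
            for n in range(n_min, n_max + 1)
            for i in range(len(token) - n + 1)]


class NgramTable:
    """Hypothetical hashed table of n-gram vectors (random, untrained)."""

    def __init__(self, dim=8, buckets=1000, seed=0):
        rng = random.Random(seed)
        self.dim = dim
        self.buckets = buckets
        self.vectors = [[rng.gauss(0.0, 0.1) for _ in range(dim)]
                        for _ in range(buckets)]

    def lookup(self, ngram):
        # crc32 is a deterministic stand-in, not the hash any real library uses.
        return self.vectors[zlib.crc32(ngram.encode("utf-8")) % self.buckets]


def component_mean(word, table):
    """Mean of one mixture component as the sum of the word's n-gram vectors."""
    mean = [0.0] * table.dim
    for gram in char_ngrams(word):
        mean = [m + v for m, v in zip(mean, table.lookup(gram))]
    return mean


table = NgramTable()
# An unseen or misspelt word still gets a mean, because it is built from n-grams.
unseen_mean = component_mean("embeddingz", table)
# Related words share n-grams, which is how statistical strength is shared.
shared = set(char_ngrams("rock")) & set(char_ngrams("rocks"))
```

Because the mean depends only on character n-grams, a word absent from the training vocabulary still receives a vector, and words with common sub-word structure overlap in the n-grams that contribute to their means.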
