ML CL LG Jun 8, 2017

Context encoders as a simple but powerful extension of word2vec

arXiv:1706.02496v11091 citations
AI Analysis

This work addresses the problem of improving word representation for natural language processing tasks, particularly for polysemy and out-of-vocabulary words, offering a simple extension to a widely used model.

The paper tackles two limitations of word2vec, its single embedding per polysemous word and its inability to handle out-of-vocabulary words, by proposing context encoders (ConEc), which extend word2vec to generate embeddings from a word's local contexts. It demonstrates their effectiveness by using these embeddings as features in the CoNLL 2003 named entity recognition task.

With a simple architecture and the ability to learn meaningful word embeddings efficiently from texts containing billions of words, word2vec remains one of the most popular neural language models used today. However, as only a single embedding is learned for every word in the vocabulary, the model fails to optimally represent words with multiple meanings. Additionally, it is not possible to create embeddings for new (out-of-vocabulary) words on the spot. Based on an intuitive interpretation of the continuous bag-of-words (CBOW) word2vec model's negative sampling training objective in terms of predicting context based similarities, we motivate an extension of the model we call context encoders (ConEc). By multiplying the matrix of trained word2vec embeddings with a word's average context vector, out-of-vocabulary (OOV) embeddings and representations for a word with multiple meanings can be created based on the word's local contexts. The benefits of this approach are illustrated by using these word embeddings as features in the CoNLL 2003 named entity recognition (NER) task.
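The core operation described in the abstract, multiplying the trained embedding matrix with a word's average context vector, can be illustrated with a minimal sketch. The toy vocabulary, the two-dimensional embeddings, the function names, the binary bag-of-words encoding of each context, and the absence of any normalization are all assumptions made for illustration; they are not taken from the paper's implementation.

```python
# Minimal sketch of the ConEc idea in pure Python (no external dependencies).
# All names and values below are hypothetical and chosen for illustration.

def average_context_vector(contexts, vocab_index):
    """Average the binary bag-of-words vectors of a word's local contexts.

    contexts: list of token lists, one per occurrence of the target word.
    vocab_index: dict mapping each in-vocabulary token to its row index.
    """
    avg = [0.0] * len(vocab_index)
    for context in contexts:
        # Tokens outside the vocabulary are ignored (an assumption).
        present = {vocab_index[t] for t in context if t in vocab_index}
        for i in present:
            avg[i] += 1.0 / len(contexts)
    return avg


def conec_embedding(avg_context, embeddings):
    """Multiply the embedding matrix (one row per vocabulary word)
    with the average context vector: y = x^T W."""
    dim = len(embeddings[0])
    return [
        sum(avg_context[i] * embeddings[i][k] for i in range(len(embeddings)))
        for k in range(dim)
    ]


# Toy "trained" word2vec embeddings: one 2-d row per vocabulary word.
vocab_index = {"river": 0, "water": 1, "money": 2, "loan": 3}
embeddings = [
    [1.0, 0.0],  # river
    [1.0, 0.0],  # water
    [0.0, 1.0],  # money
    [0.0, 1.0],  # loan
]

# The word "bank" is treated as out-of-vocabulary here; its embedding is
# built purely from the contexts it appears in, one embedding per sense.
finance_contexts = [["money", "loan"], ["loan", "the"]]
river_contexts = [["river", "water"]]

finance_sense = conec_embedding(
    average_context_vector(finance_contexts, vocab_index), embeddings
)
river_sense = conec_embedding(
    average_context_vector(river_contexts, vocab_index), embeddings
)

print(finance_sense)  # [0.0, 1.5]
print(river_sense)    # [2.0, 0.0]
```

In this toy setting the same surface word receives two different embeddings depending on which contexts are averaged, which is the mechanism the abstract describes for both OOV words and words with multiple meanings.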

Code Implementations: 1 repo