CL · Nov 15, 2014

Retrofitting Word Vectors to Semantic Lexicons

arXiv:1411.4166v4 · 234 citations
Originality: Incremental advance
AI Analysis

This work addresses a limitation of corpus-trained word vectors, which ignore the relational knowledge stored in semantic lexicons, by integrating that knowledge into the vectors. It is an incremental improvement over prior techniques.

The paper tackles the problem that word vector representations lack the information held in semantic lexicons. It proposes a method that refines existing vectors using relational data from lexicons such as WordNet, and it reports substantial improvements on standard lexical semantic evaluation tasks in several languages.

Vector space word representations are learned from distributional information of words in large corpora. Although such statistics are semantically informative, they disregard the valuable information that is contained in semantic lexicons such as WordNet, FrameNet, and the Paraphrase Database. This paper proposes a method for refining vector space representations using relational information from semantic lexicons by encouraging linked words to have similar vector representations, and it makes no assumptions about how the input vectors were constructed. Evaluated on a battery of standard lexical semantic evaluation tasks in several languages, we obtain substantial improvements starting with a variety of word vector models. Our refinement method outperforms prior techniques for incorporating semantic lexicons into the word vector training algorithms.
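
The core idea in the abstract, pulling linked words toward each other while staying close to the input vectors, can be illustrated with a minimal sketch. Everything specific here is an assumption rather than something stated in the text above: the function name `retrofit`, the iterative averaging update, the weights (`alpha = 1.0` for the original vector, `1 / number of neighbours` for each lexicon link), and the default of 10 iterations. The lexicon is modelled as a plain dictionary from a word to its linked words.

```python
from typing import Dict, List

Vector = List[float]


def retrofit(
    vectors: Dict[str, Vector],
    lexicon: Dict[str, List[str]],
    iterations: int = 10,
    alpha: float = 1.0,
) -> Dict[str, Vector]:
    """Return refined copies of `vectors`; the input is not modified.

    Assumed update (illustrative, not taken from the abstract): each word
    that has lexicon neighbours is replaced by a weighted average of its
    original vector (weight `alpha`) and its neighbours' current vectors
    (weight 1 / number_of_neighbours each).
    """
    refined = {word: list(vec) for word, vec in vectors.items()}
    for _ in range(iterations):
        for word, neighbours in lexicon.items():
            if word not in vectors:
                continue  # no input vector to refine
            linked = [n for n in neighbours if n in refined and n != word]
            if not linked:
                continue  # words without usable links keep their vector
            beta = 1.0 / len(linked)
            denominator = beta * len(linked) + alpha
            refined[word] = [
                (sum(beta * refined[n][k] for n in linked)
                 + alpha * vectors[word][k]) / denominator
                for k in range(len(vectors[word]))
            ]
    return refined


if __name__ == "__main__":
    # Toy data: two linked words and one word with no lexicon entry.
    toy_vectors = {"happy": [1.0, 0.0], "glad": [0.0, 1.0], "car": [5.0, 5.0]}
    toy_lexicon = {"happy": ["glad"], "glad": ["happy"]}
    print(retrofit(toy_vectors, toy_lexicon))
```

Because the sketch only reads the input vectors and never looks at how they were trained, it works as a post-processing step on any pre-trained model, which matches the abstract's claim that no assumptions are made about how the input vectors were constructed.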

Code Implementations: 2 repos
