CL · Nov 15, 2014

Retrofitting Word Vectors to Semantic Lexicons

Manaal Faruqui, Jesse Dodge, Sujay K. Jauhar, Chris Dyer, Eduard Hovy, Noah A. Smith

arXiv:1411.4166v4 · 234 citations

Originality: Incremental advance

AI Analysis

This work addresses a limitation of corpus-trained word vectors, which ignore the relational knowledge stored in semantic lexicons, by refining the vectors with that knowledge. It is an incremental improvement over prior techniques.

The paper tackles the problem that word vector representations lack the information held in semantic lexicons. It proposes a method that refines vectors using relational data from lexicons such as WordNet, and reports substantial improvements on standard lexical semantic evaluation tasks across multiple languages.

Vector space word representations are learned from distributional information of words in large corpora. Although such statistics are semantically informative, they disregard the valuable information that is contained in semantic lexicons such as WordNet, FrameNet, and the Paraphrase Database. This paper proposes a method for refining vector space representations using relational information from semantic lexicons by encouraging linked words to have similar vector representations, and it makes no assumptions about how the input vectors were constructed. Evaluated on a battery of standard lexical semantic evaluation tasks in several languages, we obtain substantial improvements starting with a variety of word vector models. Our refinement method outperforms prior techniques for incorporating semantic lexicons into the word vector training algorithms.
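
The idea of "encouraging linked words to have similar vector representations" can be illustrated with a small sketch. The abstract does not give the update rule, so everything specific here is an assumption made for illustration: the function name `retrofit`, the weights `alpha` (pull toward the original vector) and `beta` (pull toward each neighbour, set to one over the number of neighbours), the fixed iteration count, and the toy data. The sketch post-processes existing vectors only, in line with the claim that the method makes no assumptions about how the input vectors were built.

```python
import math


def retrofit(vectors, lexicon, iterations=10, alpha=1.0):
    """Return refined copies of `vectors` in which words linked in
    `lexicon` are pulled toward each other.

    vectors: dict mapping word -> list of floats (any pre-trained vectors)
    lexicon: dict mapping word -> list of linked words (e.g. synonyms)
    alpha:   assumed weight keeping a word near its original vector
    """
    original = {w: list(v) for w, v in vectors.items()}
    refined = {w: list(v) for w, v in vectors.items()}

    for _ in range(iterations):
        for word, neighbours in lexicon.items():
            if word not in refined:
                continue
            # Only neighbours that have a vector can contribute.
            linked = [n for n in neighbours if n in refined]
            if not linked:
                continue
            beta = 1.0 / len(linked)  # assumed: equal weight per neighbour
            # Weighted sum of the original vector and current neighbour vectors.
            total = [alpha * x for x in original[word]]
            for n in linked:
                for k, x in enumerate(refined[n]):
                    total[k] += beta * x
            denom = alpha + beta * len(linked)
            refined[word] = [t / denom for t in total]
    return refined


def distance(a, b):
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


# Toy data (hypothetical): "happy" and "glad" are linked, "table" is not.
vectors = {"happy": [1.0, 0.0], "glad": [0.0, 1.0], "table": [5.0, 5.0]}
lexicon = {"happy": ["glad"], "glad": ["happy"]}

refined = retrofit(vectors, lexicon)
print(distance(vectors["happy"], vectors["glad"]))   # before refinement
print(distance(refined["happy"], refined["glad"]))   # after refinement
```

In this sketch the linked pair moves closer together while each word stays anchored to its original vector, and words that do not appear in the lexicon are left untouched.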

