CL, Feb 17, 2018

Can Network Embedding of Distributional Thesaurus be Combined with Word Vectors for Better Representation?

arXiv:1802.06196v11089 citations
Originality: Incremental advance
AI Analysis

This work addresses the challenge of enhancing word representations for natural language processing tasks, offering a novel combination approach that improves performance without relying on handcrafted lexical resources.

The paper tackled the problem of improving word representations by combining network embeddings from a distributional thesaurus with state-of-the-art word vectors, resulting in significant performance gains in NLP tasks such as word similarity, synonym detection, and analogy detection.

Distributed representations of words learned from text have recently proved successful in various natural language processing tasks. While some methods represent words as vectors computed from text using a predictive model (Word2vec) or a dense count-based model (GloVe), others represent them in a distributional thesaurus network, where the neighborhood of a word is the set of words that have adequate context overlap with it. Motivated by the recent surge of research in network embedding techniques (DeepWalk, LINE, node2vec, etc.), we turn a distributional thesaurus network into dense word vectors and investigate how useful this distributional thesaurus embedding is for improving overall word representation. This is the first attempt to show that combining the proposed representation, obtained by distributional thesaurus embedding, with state-of-the-art word representations improves performance by a significant margin on NLP tasks such as word similarity and relatedness, synonym detection, and analogy detection. Additionally, we show that even without any handcrafted lexical resources, we can obtain representations whose performance on the word similarity and relatedness tasks is comparable to that of representations built with a lexical resource.
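The abstract does not say how the two representations are combined, so the following is only a minimal sketch of one plausible scheme, assuming simple concatenation of L2-normalized vectors followed by cosine similarity. The toy vectors, their dimensions, and the helper names (`l2_normalize`, `combine`, `cosine`) are hypothetical and chosen purely for illustration; they are not taken from the paper.

```python
import math


def l2_normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def combine(text_vec, dt_vec):
    """Concatenate the two normalized views of a word.

    Assumption: concatenation is used here only as an illustrative
    combination scheme; the source does not specify the method.
    """
    return l2_normalize(text_vec) + l2_normalize(dt_vec)


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0


# Hypothetical toy vectors standing in for text-based embeddings
# (e.g. Word2vec or GloVe) and distributional thesaurus network embeddings.
text_vectors = {
    "car": [0.9, 0.1, 0.3],
    "automobile": [0.8, 0.2, 0.4],
    "banana": [0.1, 0.9, 0.2],
}
dt_vectors = {
    "car": [0.7, 0.2],
    "automobile": [0.6, 0.3],
    "banana": [0.1, 0.8],
}

# Combined representation: one vector per word with 3 + 2 = 5 dimensions.
combined = {w: combine(text_vectors[w], dt_vectors[w]) for w in text_vectors}

sim_related = cosine(combined["car"], combined["automobile"])
sim_unrelated = cosine(combined["car"], combined["banana"])
print(f"car/automobile: {sim_related:.3f}")
print(f"car/banana:     {sim_unrelated:.3f}")
```

Normalizing each part before concatenation gives both views equal weight, so the cosine similarity of the combined vectors equals the average of the two per-view cosine similarities. Whether equal weighting is appropriate for real embeddings is a design choice that would need to be checked empirically.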

