CL AI | Aug 20, 2016

Learning Word Embeddings from Intrinsic and Extrinsic Views

Jifan Chen, Kan Chen, Xipeng Qiu, Qi Zhang, Xuanjing Huang, Zheng Zhang

arXiv:1608.05852v10.8

Originality: Incremental advance

AI Analysis

This work aims to improve word embeddings for natural language processing, particularly for rare words and for words whose meaning is not well captured by context. The contribution appears incremental, since it builds on existing embedding methods.

The paper tackles the limitations of context-based word embeddings by integrating intrinsic and extrinsic information. The resulting model is reported to be effective on word similarity, reverse dictionary, Wiki link prediction, and document classification tasks.

While word embeddings are currently predominant in natural language processing, most existing models learn them solely from their contexts. However, these context-based word embeddings are limited, since not every word's meaning can be learned from context alone. Moreover, it is difficult to learn representations of rare words because of data sparsity. In this work, we address these issues by learning word representations that integrate intrinsic (descriptive) and extrinsic (contextual) information. To demonstrate the effectiveness of our model, we evaluate it on four tasks: word similarity, reverse dictionaries, Wiki link prediction, and document classification. Experimental results show that our model is powerful in both word and document modeling.
