CL Jun 3, 2016

Enhancing the LexVec Distributed Word Representation Model Using Positional Contexts and External Memory

Alexandre Salle, Marco Idiart, Aline Villavicencio

arXiv:1606.01283v16.830 citations Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses efficiency and performance issues in word representation models for NLP applications, though its contribution is incremental.

The paper tackles two shortcomings of the LexVec model: it uses positional contexts to improve syntactic performance, and it works on aggregate data in external memory so that the PPMI matrix need not be stored in main memory. Both changes are evaluated on word similarity and analogy tasks.

In this paper we take a state-of-the-art model for distributed word representation that explicitly factorizes the positive pointwise mutual information (PPMI) matrix using window sampling and negative sampling and address two of its shortcomings. We improve syntactic performance by using positional contexts, and solve the need to store the PPMI matrix in memory by working on aggregate data in external memory. The effectiveness of both modifications is shown using word similarity and analogy tasks.
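To make the two central terms of the abstract concrete, the following is a minimal sketch of computing PPMI scores over positional contexts, where each context word is tagged with its offset from the target word. It is an illustration only, not the authors' implementation: the function names `positional_pairs` and `ppmi`, the toy sentence, and the window size are assumptions, and the sketch covers neither the factorization step nor the external-memory processing described in the paper.

```python
import math
from collections import Counter


def positional_pairs(tokens, window=2):
    """Yield (word, (context_word, offset)) pairs within a symmetric window.

    Hypothetical helper: tagging the context with its relative offset is
    what makes the context "positional".
    """
    for i, word in enumerate(tokens):
        for offset in range(-window, window + 1):
            j = i + offset
            if offset == 0 or j < 0 or j >= len(tokens):
                continue
            yield word, (tokens[j], offset)


def ppmi(pairs):
    """Return a dict mapping (word, context) to its PPMI score.

    PMI(w, c) = log( #(w, c) * N / (#(w) * #(c)) ), clipped at zero.
    """
    pair_counts = Counter(pairs)
    word_counts = Counter()
    ctx_counts = Counter()
    for (w, c), n in pair_counts.items():
        word_counts[w] += n
        ctx_counts[c] += n
    total = sum(pair_counts.values())

    scores = {}
    for (w, c), n in pair_counts.items():
        pmi = math.log(n * total / (word_counts[w] * ctx_counts[c]))
        scores[(w, c)] = max(pmi, 0.0)  # keep only positive associations
    return scores


# Toy corpus (assumption, for illustration only).
tokens = "the cat sat on the mat".split()
scores = ppmi(positional_pairs(tokens, window=1))
```

In this sketch, "the" appearing one position to the left of "cat" is a different context from "the" appearing one position to the right, so the two are counted separately. That distinction is the kind of word-order information the abstract credits with improving syntactic performance.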
