CL
Jun 3, 2016

Enhancing the LexVec Distributed Word Representation Model Using Positional Contexts and External Memory

arXiv:1606.01283v1
30 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses efficiency and performance issues in word representation models for NLP applications, but its contribution is incremental.

The paper addresses two shortcomings of the LexVec model: it uses positional contexts to improve syntactic performance, and it works on aggregate data in external memory so that the PPMI matrix no longer has to be held in main memory. Both changes are evaluated on word similarity and analogy tasks.

In this paper we take a state-of-the-art model for distributed word representation that explicitly factorizes the positive pointwise mutual information (PPMI) matrix using window sampling and negative sampling and address two of its shortcomings. We improve syntactic performance by using positional contexts, and solve the need to store the PPMI matrix in memory by working on aggregate data in external memory. The effectiveness of both modifications is shown using word similarity and analogy tasks.
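To make the two central terms in the abstract concrete, here is a minimal Python sketch of positional contexts and PPMI. It is an illustration only, not the authors' implementation: the function names `positional_pairs` and `ppmi`, the `word_offset` tagging scheme, and the toy sentence are all assumptions, and the paper's training procedure (window sampling, negative sampling, external-memory aggregation) is not reproduced. PPMI is taken in its standard form, max(0, log(P(w,c) / (P(w) P(c)))).

```python
import math
from collections import Counter


def positional_pairs(tokens, window=2):
    """Yield (word, context) pairs, tagging each context with its relative offset.

    Hypothetical helper: "the" one position to the left becomes "the_-1",
    so left and right neighbours are counted as different contexts.
    """
    for i, word in enumerate(tokens):
        for offset in range(-window, window + 1):
            j = i + offset
            if offset == 0 or j < 0 or j >= len(tokens):
                continue
            yield word, f"{tokens[j]}_{offset:+d}"


def ppmi(pairs):
    """Return a sparse {(word, context): PPMI} dict from co-occurrence pairs."""
    pair_counts = Counter(pairs)
    word_counts = Counter()
    ctx_counts = Counter()
    for (w, c), n in pair_counts.items():
        word_counts[w] += n
        ctx_counts[c] += n
    total = sum(pair_counts.values())

    scores = {}
    for (w, c), n in pair_counts.items():
        # PMI = log( P(w, c) / (P(w) * P(c)) ), estimated from raw counts.
        pmi = math.log(n * total / (word_counts[w] * ctx_counts[c]))
        scores[(w, c)] = max(pmi, 0.0)  # clip negative values to zero
    return scores


tokens = "the cat sat on the mat".split()
pairs = list(positional_pairs(tokens, window=1))
scores = ppmi(pairs)
```

With positional tagging, `("cat", "the_-1")` and `("sat", "cat_-1")` are distinct entries from their right-hand counterparts, which is the kind of word-order information the abstract credits for the improved syntactic performance. The sketch keeps all counts in an in-memory dictionary; the paper's second modification exists precisely to avoid that for large corpora.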

Code Implementations: 1 repo
