CLDC Jun 24, 2016

Efficient Parallel Learning of Word2Vec

arXiv:1606.07822v1
AI Analysis

This is an incremental improvement that speeds up Word2Vec training for NLP practitioners.

The paper tackled the problem of memory update collisions degrading efficiency in parallel Word2Vec learning, and proposed a caching strategy that improved efficiency by a factor of 4.

Since their introduction, Word2Vec and its variants have been widely used to learn semantics-preserving representations of words or entities in an embedding space, which can be used to produce state-of-the-art results for various Natural Language Processing tasks. Existing implementations aim to learn efficiently by running multiple threads in parallel while operating on a single model in shared memory, ignoring incidental memory update collisions. We show that these collisions can degrade the efficiency of parallel learning, and propose a straightforward caching strategy that improves the efficiency by a factor of 4.
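The abstract does not spell out how the caching strategy works, so the following is only a minimal sketch of one way such a cache could look, not the paper's implementation. It assumes each thread buffers updates to a set of frequently touched rows and writes them back to the shared model in batches, so that the shared rows are written less often. The names `CachedUpdater`, `hot_ids`, and `flush_every` are hypothetical, and the sketch covers writes only (a real trainer would also read from the cached values). It shows the bookkeeping rather than any speedup, which pure Python threads would not demonstrate.

```python
from collections import defaultdict


class CachedUpdater:
    """Per-thread write buffer for frequently updated rows of a shared model.

    Illustrative only: the class, its parameters, and the flush policy are
    assumptions, not details taken from the paper.
    """

    def __init__(self, shared, hot_ids, flush_every=10):
        self.shared = shared            # shared model: list of rows (lists of floats)
        self.hot_ids = set(hot_ids)     # rows assumed to be collision-prone
        self.flush_every = flush_every  # hypothetical number of buffered updates per flush
        dim = len(shared[0])
        self.pending = defaultdict(lambda: [0.0] * dim)  # accumulated deltas per hot row
        self.since_flush = 0

    def update(self, row_id, delta):
        """Apply `delta` to a row: buffered if the row is hot, direct otherwise."""
        if row_id in self.hot_ids:
            buf = self.pending[row_id]
            for i, d in enumerate(delta):
                buf[i] += d
            self.since_flush += 1
            if self.since_flush >= self.flush_every:
                self.flush()
        else:
            row = self.shared[row_id]
            for i, d in enumerate(delta):
                row[i] += d

    def flush(self):
        """Write all buffered deltas back to the shared model and reset the buffer."""
        for row_id, buf in self.pending.items():
            row = self.shared[row_id]
            for i, d in enumerate(buf):
                row[i] += d
        self.pending.clear()
        self.since_flush = 0


# Usage: row 0 is treated as a frequent (hot) row, rows 1 and 2 as rare.
model = [[0.0, 0.0] for _ in range(3)]
updater = CachedUpdater(model, hot_ids=[0], flush_every=2)
updater.update(0, [1.0, 2.0])  # buffered, shared row 0 is not written yet
updater.update(1, [0.5, 0.5])  # rare row, written directly
updater.update(0, [1.0, 2.0])  # second buffered update triggers a flush
updater.flush()                # a thread should also flush when it finishes
```

In this sketch the shared copy of a hot row is written once per `flush_every` buffered updates instead of on every update, which is the kind of reduction in write traffic a caching scheme would aim for.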

