CL LG · Oct 3, 2019

Complex networks based word embeddings

arXiv:1910.01489v10.2

Originality: Incremental advance

AI Analysis

This work addresses word embedding generation for natural language processing, but it is incremental: it adapts existing community detection methods to a known bottleneck, the dimensionality of word co-occurrence matrices.

The paper tackles the problem of learning word embeddings by applying community detection, from complex networks theory, to reduce the dimensionality of word co-occurrence matrices. The resulting embeddings are reported to be comparable to state-of-the-art approaches.

Most of the time, the first step to learn word embeddings is to build a word co-occurrence matrix. As such matrices are equivalent to graphs, complex networks theory can naturally be used to deal with such data. In this paper, we consider applying community detection, a main tool of this field, to the co-occurrence matrix corresponding to a huge corpus. Community structure is used as a way to reduce the dimensionality of the initial space. Using this community structure, we propose a method to extract word embeddings that are comparable to the state-of-the-art approaches.
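The pipeline described in the abstract (co-occurrence matrix, viewed as a graph, then community detection, then one embedding dimension per community) can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not say which community detection algorithm the authors use or how vectors are derived from communities. Here a simple deterministic label propagation stands in for the detector, and a word's vector is taken to be its normalized co-occurrence weight toward each community. The function names `cooccurrence`, `label_propagation`, and `community_embeddings` are hypothetical.

```python
from collections import Counter, defaultdict


def cooccurrence(sentences, window=2):
    """Build symmetric co-occurrence counts, i.e. a weighted undirected graph."""
    counts = defaultdict(Counter)
    for tokens in sentences:
        for i, w in enumerate(tokens):
            # Pair each token with the next `window` tokens to its right.
            for j in range(i + 1, min(i + 1 + window, len(tokens))):
                v = tokens[j]
                if v != w:
                    counts[w][v] += 1
                    counts[v][w] += 1
    return counts


def label_propagation(graph, max_iters=20):
    """Stand-in community detector (assumption, not the paper's algorithm).

    Each word starts in its own community and repeatedly adopts the label
    with the largest total edge weight among its neighbours. Ties are
    broken alphabetically so the result is deterministic.
    """
    labels = {w: w for w in graph}
    for _ in range(max_iters):
        changed = False
        for w in sorted(graph):
            votes = Counter()
            for v, weight in graph[w].items():
                votes[labels[v]] += weight
            if not votes:
                continue
            best = min(votes, key=lambda lab: (-votes[lab], lab))
            if best != labels[w]:
                labels[w] = best
                changed = True
        if not changed:
            break
    return labels


def community_embeddings(graph, labels):
    """Map each word to one coordinate per community (assumed scheme).

    Coordinate k is the share of the word's co-occurrence weight that
    falls on words belonging to community k.
    """
    communities = sorted(set(labels.values()))
    index = {c: k for k, c in enumerate(communities)}
    vectors = {}
    for w in graph:
        vec = [0.0] * len(communities)
        for v, weight in graph[w].items():
            vec[index[labels[v]]] += weight
        total = sum(vec)
        vectors[w] = [x / total for x in vec] if total else vec
    return communities, vectors


# Toy corpus (made up for this example) with two unrelated topics.
sentences = [
    ["cat", "chases", "mouse"],
    ["dog", "chases", "cat"],
    ["stocks", "rise", "today"],
    ["stocks", "fall", "today"],
]
graph = cooccurrence(sentences)
labels = label_propagation(graph)
communities, vectors = community_embeddings(graph, labels)
print(len(graph), "words reduced to", len(communities), "dimensions")
```

The point of the sketch is the dimensionality reduction: each word goes from one coordinate per vocabulary word to one coordinate per community. On a real corpus a modularity-based detector would be a more typical choice than label propagation, but that too would be an assumption about the paper's method.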
