CL · LG · Oct 9, 2019

Word Embedding Visualization Via Dictionary Learning

arXiv:1910.03833v2 · 20 citations
Originality: Incremental advance
AI Analysis

This paper provides a visualization tool for interpreting word embeddings, which is useful for NLP researchers, but the advance is incremental because it builds on existing embedding methods.

The authors tackled the problem of understanding word embeddings by applying dictionary learning to decompose word vectors into interpretable factors, discovering that many factors align with semantic and syntactic meanings previously identified manually. They demonstrated that these factors improve state-of-the-art word embedding techniques in analogy tasks by a large margin.

Co-occurrence statistics based word embedding techniques have proved to be very useful in extracting the semantic and syntactic representation of words as low dimensional continuous vectors. In this work, we discovered that dictionary learning can open up these word vectors as a linear combination of more elementary word factors. We demonstrate many of the learned factors have surprisingly strong semantic or syntactic meaning corresponding to the factors previously identified manually by human inspection. Thus dictionary learning provides a powerful visualization tool for understanding word embedding representations. Furthermore, we show that the word factors can help in identifying key semantic and syntactic differences in word analogy tasks and improve upon the state-of-the-art word embedding techniques in these tasks by a large margin.
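
The abstract's central idea, a word vector written as a linear combination of a few elementary factors, can be illustrated with a small sketch. This is not the authors' implementation: it covers only the sparse coding step, using plain matching pursuit against a fixed dictionary, whereas dictionary learning would also learn the factors from data. The factor names, the 3-dimensional vectors, and the function `matching_pursuit` are all hypothetical and chosen for illustration.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def normalize(v):
    """Scale a vector to unit Euclidean norm."""
    n = math.sqrt(dot(v, v))
    return [a / n for a in v]


def matching_pursuit(x, factors, n_nonzero=2, tol=1e-9):
    """Greedily express x as a sparse linear combination of unit-norm factors.

    Returns (coeffs, residual), where coeffs maps factor name -> weight and
    residual is the part of x that the chosen factors do not explain.
    """
    residual = list(x)
    coeffs = {}
    for _ in range(n_nonzero):
        # Pick the factor most correlated with what is still unexplained.
        name, atom = max(factors.items(),
                         key=lambda kv: abs(dot(residual, kv[1])))
        c = dot(residual, atom)
        if abs(c) < tol:
            break
        coeffs[name] = coeffs.get(name, 0.0) + c
        residual = [r - c * a for r, a in zip(residual, atom)]
    return coeffs, residual


# Hypothetical toy dictionary: in practice the factors would be learned
# from real embeddings and have hundreds of dimensions.
FACTORS = {
    "royalty": normalize([1.0, 0.0, 0.0]),
    "female": normalize([0.0, 1.0, 0.0]),
    "plural": normalize([0.0, 0.0, 1.0]),
}

# Hypothetical embedding for the word "queen".
queen = [0.9, 0.8, 0.0]
coeffs, residual = matching_pursuit(queen, FACTORS, n_nonzero=2)
print(coeffs)
```

Reading off which factors receive non-zero weight for a word is what makes the decomposition usable as a visualization: each word is summarized by a short list of named factors rather than by a dense vector.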

Code Implementations: 1 repo