CL · ML · Feb 5, 2017

All-but-the-Top: Simple and Effective Postprocessing for Word Representations

arXiv:1702.01417v2 · 383 citations
AI Analysis

An incremental improvement: a postprocessing step that makes existing word embeddings more effective in NLP applications.

The paper tackles the problem of improving off-the-shelf word representations by introducing a simple postprocessing technique that removes the common mean vector and top dominating directions, resulting in consistently better performance across various lexical and sentence-level tasks in multiple languages.

Real-valued word representations have transformed NLP applications; popular examples are word2vec and GloVe, recognized for their ability to capture linguistic regularities. In this paper, we demonstrate a very simple, and yet counter-intuitive, postprocessing technique -- eliminate the common mean vector and a few top dominating directions from the word vectors -- that renders off-the-shelf representations even stronger. The postprocessing is empirically validated on a variety of lexical-level intrinsic tasks (word similarity, concept categorization, word analogy) and sentence-level tasks (semantic textual similarity and text classification) on multiple datasets and with a variety of representation methods and hyperparameter choices in multiple languages; in each case, the processed representations are consistently better than the original ones.
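
The postprocessing described in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' reference code: the function name `all_but_the_top`, the parameter `n_directions`, and the use of an SVD to find the dominating directions are assumptions made for illustration. The abstract only says "a few" directions are removed, so the count is left as a parameter to be chosen by the caller.

```python
import numpy as np


def all_but_the_top(vectors: np.ndarray, n_directions: int) -> np.ndarray:
    """Sketch: remove the common mean and the top dominating directions.

    vectors: array of shape (vocab_size, dim), one word vector per row.
    n_directions: how many top directions to remove (hypothetical parameter).
    """
    # Step 1: subtract the mean vector shared by all word vectors.
    centered = vectors - vectors.mean(axis=0)

    # Step 2: find the dominating directions of the centered vectors.
    # The rows of vt are orthonormal, ordered by decreasing singular value.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    top = vt[:n_directions]  # shape (n_directions, dim)

    # Step 3: subtract each vector's projection onto those directions.
    return centered - centered @ top.T @ top


# Toy usage on random data with an artificial common offset.
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(50, 8)) + 3.0
processed = all_but_the_top(embeddings, n_directions=2)
```

After processing, the vectors have zero mean and no component along the removed directions; the remaining directions are left unchanged.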
