LG CL ML

Jan 15, 2013

The Expressive Power of Word Embeddings

Yanqing Chen, Bryan Perozzi, Rami Al-Rfou, Steven Skiena

arXiv:1301.3226v4

107 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses the need for better ways to evaluate word embeddings in NLP. It is incremental: it builds on existing embedding methods and does not introduce new ones.

The paper examines quality differences among publicly released word embeddings by proposing tasks that distinguish their characteristics. It finds that embeddings capture nuanced semantics even without sentence structure, that quality varies widely across the tested embeddings, and that the number of dimensions and the resolution of each dimension affect the useful features an embedding captures.

We seek to better understand the difference in quality of the several publicly released embeddings. We propose several tasks that help to distinguish the characteristics of different embeddings. Our evaluation of sentiment polarity and synonym/antonym relations shows that embeddings are able to capture surprisingly nuanced semantics even in the absence of sentence structure. Moreover, benchmarking the embeddings shows great variance in quality and characteristics of the semantics captured by the tested embeddings. Finally, we show the impact of varying the number of dimensions and the resolution of each dimension on the effective useful features captured by the embedding space. Our contributions highlight the importance of embeddings for NLP tasks and the effect of their quality on the final results.
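The abstract mentions varying the number of dimensions and the resolution of each dimension, but this listing does not say how that was done. The sketch below illustrates one plausible reading under stated assumptions: dimensions are reduced by keeping the leading coordinates, and resolution is reduced by uniform quantization to a fixed number of bits. The function names `truncate_dimensions` and `reduce_resolution` and the toy data are hypothetical and are not taken from the paper.

```python
import random


def truncate_dimensions(vectors, k):
    """Keep only the first k coordinates of each vector (assumed reduction scheme)."""
    return [v[:k] for v in vectors]


def reduce_resolution(vectors, bits):
    """Uniformly quantize every coordinate to 2**bits levels.

    The value range is taken from the data itself (global min and max),
    which is an assumption of this sketch.
    """
    if bits < 1:
        raise ValueError("bits must be at least 1")
    lo = min(min(v) for v in vectors)
    hi = max(max(v) for v in vectors)
    span = hi - lo
    if span == 0:
        # All values are identical, so there is nothing to quantize.
        return [list(v) for v in vectors]
    step = span / (2 ** bits - 1)
    # Snap each value to the nearest of the evenly spaced levels.
    return [[lo + round((x - lo) / step) * step for x in v] for v in vectors]


# Toy stand-in for an embedding table: 5 "words", 8 dimensions each.
rng = random.Random(0)
toy_embeddings = [[rng.uniform(-1.0, 1.0) for _ in range(8)] for _ in range(5)]

fewer_dims = truncate_dimensions(toy_embeddings, 4)   # 8 -> 4 dimensions
coarse = reduce_resolution(toy_embeddings, 2)          # 2 bits -> at most 4 levels
distinct_values = {round(x, 9) for v in coarse for x in v}
```

To study the effect on a downstream task, one would train the same classifier on `toy_embeddings`, `fewer_dims`, and `coarse` in turn and compare accuracy; that step is omitted here because it depends on the task data.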
