CL May 8, 2016

Problems With Evaluation of Word Embeddings Using Word Similarity Tasks

Manaal Faruqui, Yulia Tsvetkov, Pushpendre Rastogi, Chris Dyer

arXiv:1605.02276v323.5284 citations

Originality: Synthesis-oriented

AI Analysis

This paper addresses a methodological issue for the NLP community and highlights the need for better evaluation standards, but it is incremental: it summarizes existing critiques without introducing new methods.

The paper identifies problems with using word similarity tasks to evaluate word embeddings, summarizing existing solutions and concluding that this evaluation approach is unsustainable, calling for further research on evaluation methods.

Lacking standardized extrinsic evaluation methods for vector representations of words, the NLP community has relied heavily on word similarity tasks as a proxy for intrinsic evaluation of word vectors. Word similarity evaluation, which correlates the distance between vectors and human judgments of semantic similarity, is attractive because it is computationally inexpensive and fast. In this paper, we present several problems associated with the evaluation of word vectors on word similarity datasets and summarize existing solutions. Our study suggests that the use of word similarity tasks for evaluation of word vectors is not sustainable and calls for further research on evaluation methods.
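To make the evaluation procedure described in the abstract concrete, here is a minimal sketch of a word similarity evaluation. It assumes cosine similarity as the vector similarity measure and Spearman rank correlation as the agreement score, which are common choices but are not specified in the abstract. The toy vectors, word pairs, and ratings are hypothetical, and the function names are illustrative only.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    std_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    std_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (std_x * std_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


def evaluate(vectors, judgments):
    """Correlate model similarities with human ratings.

    Pairs containing a word missing from `vectors` are skipped.
    """
    model_scores, human_scores = [], []
    for w1, w2, rating in judgments:
        if w1 in vectors and w2 in vectors:
            model_scores.append(cosine(vectors[w1], vectors[w2]))
            human_scores.append(rating)
    return spearman(model_scores, human_scores)


# Hypothetical toy data, for illustration only.
toy_vectors = {
    "cat": [1.0, 0.0],
    "dog": [0.8, 0.2],
    "car": [0.0, 1.0],
}
toy_judgments = [
    ("cat", "dog", 8.5),
    ("cat", "car", 1.0),
    ("dog", "car", 2.0),
]

if __name__ == "__main__":
    print(evaluate(toy_vectors, toy_judgments))
```

Because the score depends only on the rank order of similarities, a single number summarizes how well the model's ordering of word pairs matches the human ordering. That simplicity is what makes the approach cheap and fast.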

