CL, Sep 5, 2015

Take and Took, Gaggle and Goose, Book and Read: Evaluating the Utility of Vector Differences for Lexical Relation Learning

Ekaterina Vylomova, Laura Rimell, Trevor Cohn, Timothy Baldwin

arXiv:1509.01692v4

Originality: Incremental advance

AI Analysis

This paper assesses how general vector differences are for lexical relation learning in NLP. It is an incremental advance: it builds on prior work and extends the evaluation to a broader range of relations.

The paper evaluated the utility of vector differences from pre-trained word embeddings for learning lexical relations, finding that with supervised training, vector subtraction generalizes well to a broad range of relations, including over unseen lexical items.

Recent work on word embeddings has shown that simple vector subtraction over pre-trained embeddings is surprisingly effective at capturing different lexical relations, despite lacking explicit supervision. Prior work has evaluated this intriguing result using a word analogy prediction formulation and hand-selected relations, but the generality of the finding over a broader range of lexical relation types and different learning settings has not been evaluated. In this paper, we carry out such an evaluation in two learning settings: (1) spectral clustering to induce word relations, and (2) supervised learning to classify vector differences into relation types. We find that word embeddings capture a surprising amount of information, and that, under suitable supervised training, vector subtraction generalises well to a broad range of relations, including over unseen lexical items.
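The supervised setting described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. The embeddings are synthetic (each relation is modelled as a fixed offset plus noise, which real embeddings only approximate), the relation names are placeholders, and a nearest-centroid classifier stands in for whatever supervised learner the paper uses. All function names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
DIM = 20  # toy embedding dimensionality (assumption)

# Assumption: each relation corresponds to one direction in embedding space.
RELATION_OFFSETS = {
    "past_tense": rng.normal(size=DIM),
    "collective_noun": rng.normal(size=DIM),
}


def make_pairs(relation, n, noise=0.3):
    """Generate n synthetic (word1, word2) embedding pairs for a relation."""
    w1 = rng.normal(size=(n, DIM))
    w2 = w1 + RELATION_OFFSETS[relation] + noise * rng.normal(size=(n, DIM))
    return w1, w2


def build_dataset(n_per_relation):
    """Return difference vectors (w2 - w1) and their relation labels."""
    diffs, labels = [], []
    for relation in RELATION_OFFSETS:
        w1, w2 = make_pairs(relation, n_per_relation)
        diffs.append(w2 - w1)
        labels += [relation] * n_per_relation
    return np.vstack(diffs), np.array(labels)


def fit_centroids(diffs, labels):
    """Average the difference vectors of each relation type."""
    return {str(lab): diffs[labels == lab].mean(axis=0) for lab in np.unique(labels)}


def predict(centroids, diffs):
    """Assign each difference vector to the relation with the nearest centroid."""
    names = list(centroids)
    matrix = np.stack([centroids[name] for name in names])
    dists = np.linalg.norm(diffs[:, None, :] - matrix[None, :, :], axis=2)
    return np.array([names[i] for i in dists.argmin(axis=1)])


train_X, train_y = build_dataset(50)
test_X, test_y = build_dataset(20)  # freshly sampled "words", unseen in training
centroids = fit_centroids(train_X, train_y)
accuracy = float((predict(centroids, test_X) == test_y).mean())
print(f"held-out accuracy: {accuracy:.2f}")
```

Because the test pairs are sampled independently of the training pairs, the sketch mirrors the "unseen lexical items" condition: the classifier only sees the difference vector, never the identity of either word.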
