CL · Mar 9, 2020

Sentence Analogies: Exploring Linguistic Relationships and Regularities in Sentence Embeddings

arXiv:2003.04036v11.312 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses a gap in NLP researchers' understanding of the properties of sentence embeddings, but it is incremental: it extends word analogy evaluation to sentences without introducing a new embedding method.

The paper investigates whether sentence vector representations exhibit regularities similar to those seen in word analogies. It proposes schemes for inducing evaluation data from lexical analogy data and from semantic relationships between sentences, then tests a range of embedding methods, including BERT-style models. The models differ substantially in how well they reflect these regularities.

While important properties of word vector representations have been studied extensively, far less is known about the properties of sentence vector representations. Word vectors are often evaluated by assessing to what degree they exhibit regularities with regard to relationships of the sort considered in word analogies. In this paper, we investigate to what extent commonly used sentence vector representation spaces also reflect certain kinds of regularities. We propose a number of schemes to induce evaluation data, based on lexical analogy data as well as semantic relationships between sentences. Our experiments consider a wide range of sentence embedding methods, including ones based on BERT-style contextual embeddings. We find that different models differ substantially in their ability to reflect such regularities.
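To make the idea of an analogy test over sentence vectors concrete, here is a minimal sketch of the vector-offset approach commonly used for word analogies, applied to sentences. It is an illustration under stated assumptions, not the paper's evaluation scheme, which this page does not describe. The function names (`cosine`, `solve_analogy`, `analogy_accuracy`), the example sentences, and the 2-dimensional toy vectors are all hypothetical; in practice the vectors would come from a sentence encoder.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)

def solve_analogy(emb, a, b, c):
    """Answer 'a is to b as c is to ?' with the vector-offset method.

    emb maps each sentence to its vector. The three query sentences are
    excluded from the candidates, as is usual in word analogy tests.
    """
    target = [vb - va + vc for va, vb, vc in zip(emb[a], emb[b], emb[c])]
    candidates = [s for s in emb if s not in (a, b, c)]
    return max(candidates, key=lambda s: cosine(emb[s], target))

def analogy_accuracy(emb, quadruples):
    """Fraction of (a, b, c, d) quadruples for which the predicted sentence equals d."""
    hits = sum(1 for a, b, c, d in quadruples if solve_analogy(emb, a, b, c) == d)
    return hits / len(quadruples)

# Hypothetical toy embeddings (assumption: real ones would come from an encoder).
emb = {
    "He is a king.": [1.0, 0.0],
    "She is a queen.": [1.0, 1.0],
    "He is an actor.": [3.0, 0.0],
    "She is an actress.": [3.0, 1.0],
    "It is raining.": [0.0, -2.0],
}

quadruples = [
    ("He is a king.", "She is a queen.", "He is an actor.", "She is an actress."),
]

predicted = solve_analogy(emb, "He is a king.", "She is a queen.", "He is an actor.")
accuracy = analogy_accuracy(emb, quadruples)
```

Comparing `analogy_accuracy` across embedding models on the same set of quadruples is one simple way to see whether models differ in how well they preserve such regularities.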
