CL · IT · LG · Oct 18, 2022

On the Information Content of Predictions in Word Analogy Tests

arXiv:2210.09972v1 · 1 citation · h-index: 9
Originality: Incremental advance
AI Analysis

The paper provides a method to assess the relevance of analogies in NLP tasks, but the advance is incremental because it builds on existing word embedding models and test sets.

The paper tackles the problem of quantifying the information content of analogies in word analogy tests. It finds that proximity hints are more relevant than the analogies themselves, which carry about one bit of information.

An approach is proposed to quantify, in bits of information, the actual relevance of analogies in analogy tests. The main component of this approach is a softaccuracy estimator that also yields entropy estimates with compensated biases. Experimental results obtained with pre-trained GloVe 300-D vectors and two public analogy test sets show that proximity hints are much more relevant than analogies in analogy tests, from an information content perspective. Accordingly, a simple word embedding model is used to predict that analogies carry about one bit of information, which is experimentally corroborated.
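To make the idea of measuring a hint "in bits" concrete, the following is a minimal toy sketch, not the paper's estimator. It assumes made-up 2-D vectors, a softmax over cosine scores with an arbitrary temperature, and two hypothetical scoring functions: `offset_scores` (the usual vector-offset analogy, b - a + c) and `proximity_scores` (closeness to b and c only). The difference in surprisal of the correct answer under the two scorings is one simple way to express how many extra bits the analogy adds on top of proximity.

```python
import math

# Toy 2-D embeddings, invented for illustration (not GloVe vectors).
VECTORS = {
    "man": (1.0, 0.0),
    "woman": (-1.0, 0.0),
    "king": (1.0, 1.0),
    "queen": (-1.0, 1.0),
    "prince": (1.0, 0.8),
    "apple": (0.1, -1.0),
}

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(y * y for y in v))
    return dot / (norm_u * norm_v)

def entropy_bits(probs):
    """Shannon entropy of a probability distribution, in bits."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

def softmax(scores, temperature=1.0):
    """Turn a dict of scores into a dict of probabilities."""
    top = max(scores.values())
    exps = {w: math.exp((s - top) / temperature) for w, s in scores.items()}
    total = sum(exps.values())
    return {w: e / total for w, e in exps.items()}

def candidates(question_words):
    """Answer candidates: every vocabulary word not used in the question."""
    return [w for w in VECTORS if w not in question_words]

def offset_scores(a, b, c):
    """Score candidates by similarity to the offset vector b - a + c."""
    target = [vb - va + vc
              for va, vb, vc in zip(VECTORS[a], VECTORS[b], VECTORS[c])]
    return {w: cosine(target, VECTORS[w]) for w in candidates((a, b, c))}

def proximity_scores(a, b, c):
    """Score candidates by closeness to b and c only; a's vector is unused."""
    return {w: cosine(VECTORS[b], VECTORS[w]) + cosine(VECTORS[c], VECTORS[w])
            for w in candidates((a, b, c))}

def surprisal_bits(scores, answer):
    """Bits of surprise at the correct answer under softmax(scores)."""
    return -math.log2(softmax(scores)[answer])

# Question: man is to king as woman is to ? (expected answer: queen)
question = ("man", "king", "woman")
answer = "queen"

offset = offset_scores(*question)
proximity = proximity_scores(*question)
predicted = max(offset, key=offset.get)

# Extra information (in bits) the analogy provides beyond the proximity hint.
gain_bits = surprisal_bits(proximity, answer) - surprisal_bits(offset, answer)
```

On this toy vocabulary both scorings already rank the correct word first, so the analogy's contribution shows up only as a modest reduction in surprisal. The numeric value depends entirely on the invented vectors and temperature and says nothing about the paper's reported figure.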
