CL Mar 5, 2021

Overcoming Poor Word Embeddings with Word Definitions

arXiv:2103.03842v131.5712 citations

Originality: Incremental advance

AI Analysis

The paper addresses a specific bottleneck in NLP for applications that deal with rare words, but the advance is incremental because it builds on existing embedding methods.

The paper tackles the problem of natural language inference models struggling with rare words. It explores how a model can use word definitions, provided in natural text, to improve performance, and it recovers most of the performance gap caused by using a completely untrained word.

Modern natural language understanding models depend on pretrained subword embeddings, but applications may need to reason about words that were never or rarely seen during pretraining. We show that examples that depend critically on a rarer word are more challenging for natural language inference models. Then we explore how a model could learn to use definitions, provided in natural text, to overcome this handicap. Our model's understanding of a definition is usually weaker than a well-modeled word embedding, but it recovers most of the performance gap from using a completely untrained word.
