CL, LG · Jul 20, 2017

High-risk learning: acquiring new word vectors from tiny data

arXiv:1707.06556v11136 citations
Originality: Incremental advance
AI Analysis

This paper addresses data scarcity in distributional semantics for NLP applications. The advance is incremental, since it builds on existing neural language models.

The paper tackles the problem of learning word vectors from tiny data, showing that minor modifications to Word2Vec enable learning new terms from few examples using background knowledge, achieving a large increase in performance over state-of-the-art models on a definitional task.

Distributional semantics models are known to struggle with small data. It is generally accepted that in order to learn 'a good vector' for a word, a model must have sufficient examples of its usage. This contradicts the fact that humans can guess the meaning of a word from a few occurrences only. In this paper, we show that a neural language model such as Word2Vec only necessitates minor modifications to its standard architecture to learn new terms from tiny data, using background knowledge from a previously learnt semantic space. We test our model on word definitions and on a nonce task involving 2-6 sentences' worth of context, showing a large increase in performance over state-of-the-art models on the definitional task.
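As a rough illustration of the idea in the abstract (learning a vector for an unseen word from one short context by reusing a previously learnt space), here is a minimal sketch. It is not the paper's implementation. The toy background vectors, the function names `additive_init`, `refine`, and `cosine`, and the hyperparameters `lr`, `decay`, and `epochs` are all assumptions made for this example. The sketch initialises the new vector as the sum of known context vectors, then applies a few skip-gram-style positive updates with a high, decaying learning rate while the background space stays frozen.

```python
import numpy as np


def cosine(a, b):
    """Cosine similarity between two 1-D vectors."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def additive_init(context_words, background):
    """Sum the background vectors of the context words that are known."""
    vecs = [background[w] for w in context_words if w in background]
    if not vecs:
        raise ValueError("no context word found in the background space")
    return np.sum(vecs, axis=0)


def refine(nonce_vec, context_words, background, lr=1.0, decay=0.5, epochs=3):
    """Pull the new vector towards its context vectors (positive updates only).

    Hypothetical simplification: no negative sampling, and the background
    vectors are never modified.
    """
    v = np.asarray(nonce_vec, dtype=float).copy()
    for _ in range(epochs):
        for w in context_words:
            if w not in background:
                continue
            c = background[w]
            score = 1.0 / (1.0 + np.exp(-np.dot(v, c)))
            # Gradient of log(sigmoid(v . c)) with respect to v is (1 - score) * c.
            v += lr * (1.0 - score) * c
        lr *= decay  # shrink the step size after each pass
    return v


# Toy 3-dimensional "background space" (made-up numbers, for illustration only).
background = {
    "cat": np.array([1.0, 0.0, 0.0]),
    "pet": np.array([0.8, 0.2, 0.0]),
    "furry": np.array([0.7, 0.3, 0.0]),
    "car": np.array([0.0, 0.0, 1.0]),
}

# One definitional sentence for the unseen word "wug".
context = "a wug is a furry pet".split()
wug = refine(additive_init(context, background), context, background)

print("sim(wug, cat):", cosine(wug, background["cat"]))
print("sim(wug, car):", cosine(wug, background["car"]))
```

With these toy vectors the new word ends up closer to "cat" than to "car", because its only known context words are "furry" and "pet". A real setup would replace the dictionary with vectors from a trained model.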

Code Implementations: 1 repo
