CL | AI | May 14, 2018

A La Carte Embedding: Cheap but Effective Induction of Semantic Feature Vectors

arXiv:1805.05388v11120 citations
Originality: Incremental advance
AI Analysis

This paper addresses the need for efficient and effective embedding induction in natural language processing, particularly for domain adaptation and transfer learning. The advance is incremental because it builds on existing theoretical results.

The paper tackles the problem of inducing embeddings for rare or unseen textual features by introducing a la carte embedding, a method based on a linear transformation learned from pretrained word vectors. The method needs fewer context examples to learn high-quality embeddings and achieves state-of-the-art results on a nonce task and some unsupervised document classification tasks.

Motivations like domain adaptation, transfer learning, and feature learning have fueled interest in inducing embeddings for rare or unseen words, n-grams, synsets, and other textual features. This paper introduces a la carte embedding, a simple and general alternative to the usual word2vec-based approaches for building such representations that is based upon recent theoretical results for GloVe-like embeddings. Our method relies mainly on a linear transformation that is efficiently learnable using pretrained word vectors and linear regression. This transform is applicable on the fly in the future when a new text feature or rare word is encountered, even if only a single usage example is available. We introduce a new dataset showing how the a la carte method requires fewer examples of words in context to learn high-quality embeddings and we obtain state-of-the-art results on a nonce task and some unsupervised document classification tasks.
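The abstract's core idea, a linear transform fit by regression on pretrained word vectors and then applied to a new feature's contexts, can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation. The abstract does not spell out how contexts are summarized, so the sketch assumes a plain average of the pretrained vectors inside a fixed window. The function names (context_average, fit_transform, induce), the window size, and the toy corpus are all hypothetical.

```python
import numpy as np

def context_average(target, sentences, vectors, window=2):
    """Average the pretrained vectors of words within `window` tokens of
    each occurrence of `target`. Returns None if no usable context exists."""
    dim = len(next(iter(vectors.values())))
    total, count = np.zeros(dim), 0
    for sent in sentences:
        for i, tok in enumerate(sent):
            if tok != target:
                continue
            lo, hi = max(0, i - window), min(len(sent), i + window + 1)
            for j in range(lo, hi):
                if j != i and sent[j] in vectors:
                    total += vectors[sent[j]]
                    count += 1
    return total / count if count else None

def fit_transform(sentences, vectors, window=2):
    """Fit a matrix A by least squares so that A @ u_w approximates v_w,
    where u_w is the context average and v_w the pretrained vector of w."""
    X, Y = [], []
    for word, vec in vectors.items():
        u = context_average(word, sentences, vectors, window)
        if u is not None:
            X.append(u)
            Y.append(vec)
    X, Y = np.array(X), np.array(Y)
    # Solve min_B ||X B - Y||_F, then transpose so that v is approx. A @ u.
    B, *_ = np.linalg.lstsq(X, Y, rcond=None)
    return B.T

def induce(target, sentences, vectors, A, window=2):
    """Embed a new or rare feature from its contexts using the learned A."""
    u = context_average(target, sentences, vectors, window)
    return None if u is None else A @ u

# Toy data: random stand-ins for pretrained vectors (assumption, not real embeddings).
rng = np.random.default_rng(0)
vocab = ["the", "cat", "sat", "on", "mat", "dog", "ran"]
vectors = {w: rng.normal(size=4) for w in vocab}
corpus = [
    ["the", "cat", "sat", "on", "the", "mat"],
    ["the", "dog", "ran", "on", "the", "mat"],
    ["the", "cat", "ran"],
]
A = fit_transform(corpus, vectors)
# A single usage example of the unseen word "wug" is enough to get a vector.
new_vec = induce("wug", [["the", "wug", "sat", "on", "the", "mat"]], vectors, A)
```

With real pretrained vectors and a large corpus, the same three steps would apply: average contexts, fit the transform once, then reuse it whenever a new feature appears. The single-sentence call for "wug" mirrors the abstract's claim that one usage example can suffice.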

Code Implementations: 1 repo
