CL · May 25, 2018

Lifelong Domain Word Embedding via Meta-Learning

arXiv:1805.09991v1 · 39 citations
Originality: Incremental advance
AI Analysis

The paper addresses the challenge of domain-specific NLP tasks with limited in-domain data, though its contribution is incremental: it applies meta-learning to lifelong embedding learning.

The paper tackles the problem of learning high-quality domain word embeddings when in-domain corpora are small by proposing a lifelong learning approach that uses meta-learning to expand new domain corpora with data from past domains. Experimental results show that this process improves downstream task performance.

Learning high-quality domain word embeddings is important for achieving good performance in many NLP tasks. General-purpose embeddings trained on large-scale corpora are often sub-optimal for domain-specific applications. However, domain-specific tasks often do not have large in-domain corpora for training high-quality domain embeddings. In this paper, we propose a novel lifelong learning setting for domain embedding. That is, when performing the new domain embedding, the system has seen many past domains, and it tries to expand the new in-domain corpus by exploiting the corpora from the past domains via meta-learning. The proposed meta-learner characterizes the similarities of the contexts of the same word in many domain corpora, which helps retrieve relevant data from the past domains to expand the new domain corpus. Experimental results show that domain embeddings produced from such a process improve the performance of the downstream tasks.
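The corpus-expansion idea in the abstract can be illustrated with a minimal sketch. It is not the paper's method: the learned meta-learner is replaced here by plain cosine similarity over co-occurrence counts, and the function names (`context_counts`, `cosine`, `expand_corpus`), the window size, and the threshold are all hypothetical choices made for illustration.

```python
from collections import Counter
import math


def context_counts(corpus, word, window=2):
    """Count tokens that co-occur with `word` within +/- `window` positions."""
    counts = Counter()
    for sentence in corpus:
        tokens = sentence.lower().split()
        for i, tok in enumerate(tokens):
            if tok == word:
                lo = max(0, i - window)
                hi = min(len(tokens), i + window + 1)
                counts.update(
                    t for j, t in enumerate(tokens[lo:hi], start=lo) if j != i
                )
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def expand_corpus(new_corpus, past_corpora, vocab, threshold=0.5, window=2):
    """Add past-domain sentences whose shared words are used in similar contexts.

    Assumption: a fixed cosine threshold stands in for the learned
    meta-learner described in the abstract.
    """
    expanded = list(new_corpus)
    for corpus in past_corpora.values():
        # Words whose contexts in this past domain resemble the new domain.
        relevant = {
            w
            for w in vocab
            if cosine(
                context_counts(new_corpus, w, window),
                context_counts(corpus, w, window),
            )
            >= threshold
        }
        # Keep only past-domain sentences that contain a relevant word.
        expanded.extend(
            s for s in corpus if relevant & set(s.lower().split())
        )
    return expanded
```

Called with a small new-domain corpus, a dictionary of past-domain corpora, and a list of shared words, `expand_corpus` returns the new corpus plus the retrieved past-domain sentences. The expanded corpus would then be passed to an ordinary embedding trainer. A fixed similarity threshold is the crudest possible stand-in; the abstract's point is that this similarity judgment is learned across many domains rather than hand-set.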

Code Implementations: 1 repo