Lifelong Domain Word Embedding via Meta-Learning
This paper addresses the challenge of domain-specific NLP tasks with limited in-domain data, though its contribution is incremental in that it applies meta-learning to lifelong embedding learning.
The paper tackles the problem of learning high-quality domain word embeddings when in-domain corpora are small. It proposes a lifelong learning approach that uses meta-learning to expand the new domain corpus with relevant data from past domains. Experimental results show that embeddings produced from the expanded corpus improve downstream task performance.
Learning high-quality domain word embeddings is important for achieving good performance in many NLP tasks. General-purpose embeddings trained on large-scale corpora are often sub-optimal for domain-specific applications. However, domain-specific tasks often do not have large in-domain corpora for training high-quality domain embeddings. In this paper, we propose a novel lifelong learning setting for domain embedding. That is, when performing the new domain embedding, the system has seen many past domains, and it tries to expand the new in-domain corpus by exploiting the corpora from the past domains via meta-learning. The proposed meta-learner characterizes the similarities of the contexts of the same word in many domain corpora, which helps retrieve relevant data from the past domains to expand the new domain corpus. Experimental results show that domain embeddings produced from such a process improve the performance of the downstream tasks.
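The corpus-expansion idea described in the abstract can be illustrated with a minimal sketch. This is not the paper's meta-learner: as a stated simplification, the learned similarity model is replaced with plain cosine similarity over co-occurrence counts. All names here (`context_counts`, `cosine`, `expand_corpus`), the `window` and `threshold` parameters, and the toy corpora are hypothetical and chosen only for illustration.

```python
from collections import Counter
import math


def context_counts(corpus, word, window=2):
    """Count tokens that co-occur with `word` within `window` positions."""
    counts = Counter()
    for sentence in corpus:
        for i, tok in enumerate(sentence):
            if tok != word:
                continue
            lo = max(0, i - window)
            hi = min(len(sentence), i + window + 1)
            # Skip position i itself so the word is not its own context.
            counts.update(sentence[j] for j in range(lo, hi) if j != i)
    return counts


def cosine(a, b):
    """Cosine similarity between two sparse count vectors (Counters)."""
    dot = sum(a[k] * b[k] for k in a if k in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def expand_corpus(new_corpus, past_corpora, vocab, threshold=0.5, window=2):
    """Add past-domain sentences containing words whose contexts look similar.

    Assumption: a fixed similarity threshold stands in for the decision
    that a learned meta-learner would make in the actual approach.
    """
    expanded = list(new_corpus)
    for corpus in past_corpora.values():
        relevant = {
            w for w in vocab
            if cosine(context_counts(new_corpus, w, window),
                      context_counts(corpus, w, window)) >= threshold
        }
        # Keep only past sentences that mention at least one relevant word.
        expanded.extend(s for s in corpus if relevant.intersection(s))
    return expanded


# Toy data: "battery" is used similarly in the phone domain, differently for cars.
new_corpus = [["the", "battery", "life", "is", "long"]]
past_corpora = {
    "phones": [["battery", "life", "is", "short"]],
    "cars": [["the", "car", "battery", "died", "today"]],
}
expanded = expand_corpus(new_corpus, past_corpora, vocab={"battery"})
```

In this toy run, only the phone-domain sentence is added to the new corpus, because the contexts of "battery" there overlap strongly with the new domain. A learned similarity function would replace `cosine` in a setup closer to the one the abstract describes.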