CL · Sep 17, 2021

Task-adaptive Pre-training of Language Models with Word Embedding Regularization

Kosuke Nishida, Kyosuke Nishida, Sen Yoshida

arXiv:2109.08354v131.5712 citations

Originality: Incremental advance

AI Analysis

This work addresses domain adaptation for language models, offering an incremental improvement for tasks like biomedical question answering.

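The paper addresses the adaptation of pre-trained language models to specific domains by proposing TAPTER, a fine-tuning process that runs additional pre-training while pulling the model's static word embeddings toward fastText embeddings obtained from the target-domain data. TAPTER improved on both standard fine-tuning and task-adaptive pre-training on BioASQ and SQuAD when the pre-training corpora were not dominated by in-domain data.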

Pre-trained language models (PTLMs) acquire domain-independent linguistic knowledge through pre-training with massive textual resources. Additional pre-training is effective in adapting PTLMs to domains that are not well covered by the pre-training corpora. Here, we focus on the static word embeddings of PTLMs for domain adaptation to teach PTLMs domain-specific meanings of words. We propose a novel fine-tuning process: task-adaptive pre-training with word embedding regularization (TAPTER). TAPTER runs additional pre-training by making the static word embeddings of a PTLM close to the word embeddings obtained in the target domain with fastText. TAPTER requires no additional corpus except for the training data of the downstream task. We confirmed that TAPTER improves the performance of the standard fine-tuning and the task-adaptive pre-training on BioASQ (question answering in the biomedical domain) and on SQuAD (the Wikipedia domain) when their pre-training corpora were not dominated by in-domain data.
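The regularization idea in the abstract can be illustrated with a small sketch. The abstract does not give the exact loss, so everything below is an assumption made for illustration: the squared-distance penalty, the linear projection `proj` that maps PTLM embeddings into the fastText space, the weight `weight`, and the function names `embedding_regularizer` and `total_loss` are all hypothetical, not the authors' implementation.

```python
import numpy as np


def embedding_regularizer(ptlm_emb, domain_emb, proj):
    """Mean squared distance between projected PTLM embeddings and
    domain (e.g. fastText) embeddings for a shared vocabulary.

    ptlm_emb:   (vocab, d_model) static word embeddings of the PTLM
    domain_emb: (vocab, d_domain) embeddings obtained on the target domain
    proj:       (d_model, d_domain) assumed linear map between the two spaces
    """
    diff = ptlm_emb @ proj - domain_emb
    return float(np.mean(np.sum(diff ** 2, axis=1)))


def total_loss(pretraining_loss, ptlm_emb, domain_emb, proj, weight=0.1):
    """Additional pre-training objective plus the weighted regularizer.

    `pretraining_loss` stands in for whatever language-modeling loss the
    additional pre-training uses; `weight` is an assumed hyperparameter.
    """
    return pretraining_loss + weight * embedding_regularizer(
        ptlm_emb, domain_emb, proj
    )


if __name__ == "__main__":
    ptlm = np.array([[1.0, 0.0], [0.0, 1.0]])
    domain = np.zeros((2, 2))
    identity = np.eye(2)
    print(embedding_regularizer(ptlm, domain, identity))
    print(total_loss(2.0, ptlm, domain, identity, weight=0.5))
```

In this sketch the penalty is zero when the projected PTLM embeddings already match the domain embeddings, so the regularizer only pushes on words whose meaning differs between the general and target domains.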
