CL · Apr 7, 2020

Inexpensive Domain Adaptation of Pretrained Language Models: Case Studies on Biomedical NER and Covid-19 QA

arXiv:2004.03354v4 · 1003 citations
AI Analysis

This paper addresses the high cost of domain adaptation for researchers and practitioners in fields such as biomedicine by offering a cheaper, though partial, alternative to full domain-specific pretraining.

The paper tackles the high cost of domain adaptation for pretrained language models by proposing a cheaper method: train Word2Vec on target-domain text and align the resulting word vectors with the model's wordpiece vectors. On eight biomedical NER tasks, the method recovers over 60% of BioBERT's F1 gain over BERT at 5% of BioBERT's CO2 footprint and 2% of its cloud compute cost. The paper also shows how to quickly adapt a general-domain QA model to Covid-19.
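The "over 60%" figure is a ratio of F1 differences: how much of the gap between the general-domain baseline and the fully adapted model the cheaper method closes. A minimal sketch of that arithmetic follows. The function name `delta_coverage` and the three F1 scores are hypothetical, chosen only for illustration; they are not results from the paper.

```python
def delta_coverage(base_f1, adapted_f1, reference_f1):
    """Fraction of the reference-vs-base F1 gap closed by the adapted model.

    base_f1:      F1 of the general-domain model (e.g. BERT)
    adapted_f1:   F1 of the cheaply adapted model
    reference_f1: F1 of the fully pretrained domain model (e.g. BioBERT)
    """
    gap = reference_f1 - base_f1
    if gap == 0:
        raise ValueError("reference and base F1 are equal; coverage is undefined")
    return (adapted_f1 - base_f1) / gap


# Hypothetical scores, NOT taken from the paper.
coverage = delta_coverage(base_f1=80.0, adapted_f1=83.0, reference_f1=85.0)
print(f"covered {coverage:.0%} of the F1 delta")
```

A coverage of 1.0 would mean the cheap method matches the fully adapted model, and 0.0 would mean no improvement over the baseline.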

Domain adaptation of Pretrained Language Models (PTLMs) is typically achieved by unsupervised pretraining on target-domain text. While successful, this approach is expensive in terms of hardware, runtime and CO2 emissions. Here, we propose a cheaper alternative: We train Word2Vec on target-domain text and align the resulting word vectors with the wordpiece vectors of a general-domain PTLM. We evaluate on eight biomedical Named Entity Recognition (NER) tasks and compare against the recently proposed BioBERT model. We cover over 60% of the BioBERT-BERT F1 delta, at 5% of BioBERT's CO2 footprint and 2% of its cloud compute cost. We also show how to quickly adapt an existing general-domain Question Answering (QA) model to an emerging domain: the Covid-19 pandemic.
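The abstract does not say how the Word2Vec vectors are aligned with the wordpiece vectors. One plausible reading, assumed here, is a linear map fitted by least squares on tokens that appear in both vocabularies and then applied to domain words the language model lacks. The sketch below uses random toy matrices in place of real embeddings; `fit_alignment` and all variable names are illustrative, not the authors' code.

```python
import numpy as np


def fit_alignment(w2v_vectors, lm_vectors):
    """Least-squares map W such that w2v_vectors @ W approximates lm_vectors.

    w2v_vectors: (n_shared, d_w2v) Word2Vec vectors of shared tokens
    lm_vectors:  (n_shared, d_lm)  wordpiece vectors of the same tokens
    """
    W, *_ = np.linalg.lstsq(w2v_vectors, lm_vectors, rcond=None)
    return W


# Toy stand-ins for real embeddings (assumption: sizes chosen for illustration).
rng = np.random.default_rng(0)
d_w2v, d_lm, n_shared = 4, 6, 50
true_map = rng.normal(size=(d_w2v, d_lm))
shared_w2v = rng.normal(size=(n_shared, d_w2v))
shared_lm = shared_w2v @ true_map  # noiseless, so the fit should recover true_map

W = fit_alignment(shared_w2v, shared_lm)

# Project a domain-specific word (known only to Word2Vec) into the LM's input space.
new_word_w2v = rng.normal(size=(1, d_w2v))
new_word_lm = new_word_w2v @ W
print(new_word_lm.shape)
```

With real embeddings the fit would not be exact, and the projected vectors would be added to the model's input embedding table alongside the original wordpiece vectors.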
