CL · Dec 4, 2020

Fine-tuning BERT for Low-Resource Natural Language Understanding via Active Learning

arXiv:2012.02462v10.00998 citations
AI Analysis50

This work addresses the challenge of efficiently adapting powerful pre-trained models like BERT for NLU tasks when very little labeled data is available, which is a common problem for researchers and practitioners in specialized domains.

This paper investigates fine-tuning BERT for natural language understanding in low-resource settings (fewer than 1,000 training data points). It shows that pool-based active learning improves model performance when queries are chosen to maximize the model's approximate knowledge gain, and that freezing BERT layers reduces the number of trainable parameters, making the model better suited to low-resource settings.

Recently, leveraging pre-trained Transformer-based language models in downstream, task-specific models has advanced state-of-the-art results in natural language understanding tasks. However, little research has explored the suitability of this approach in low-resource settings with fewer than 1,000 training data points. In this work, we explore fine-tuning methods for BERT (a pre-trained Transformer-based language model) that use pool-based active learning to speed up training while keeping the cost of labeling new data constant. Our experimental results on the GLUE data set show an advantage in model performance when the approximate knowledge gain of the model is maximized while querying from the pool of unlabeled data. Finally, we demonstrate and analyze the benefits of freezing layers of the language model during fine-tuning to reduce the number of trainable parameters, making it more suitable for low-resource settings.
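To make the query step of pool-based active learning concrete, here is a minimal sketch. The abstract does not specify the acquisition function, so this sketch assumes one common choice: a mutual-information (BALD-style) score estimated from several stochastic forward passes, for example with dropout left active. The function names `predictive_entropy`, `knowledge_gain`, and `query` and the array layout `(passes, pool_size, classes)` are illustrative assumptions, not the paper's code.

```python
import numpy as np

EPS = 1e-12  # avoids log(0); an implementation detail of this sketch


def predictive_entropy(probs: np.ndarray) -> np.ndarray:
    """Entropy of the mean prediction; probs has shape (T, N, C)."""
    mean = probs.mean(axis=0)
    return -(mean * np.log(mean + EPS)).sum(axis=-1)


def knowledge_gain(probs: np.ndarray) -> np.ndarray:
    """Assumed BALD-style score: total entropy minus expected per-pass entropy."""
    per_pass = -(probs * np.log(probs + EPS)).sum(axis=-1)  # shape (T, N)
    return predictive_entropy(probs) - per_pass.mean(axis=0)


def query(probs: np.ndarray, k: int) -> np.ndarray:
    """Indices of the k pool examples with the highest score."""
    scores = knowledge_gain(probs)
    return np.argsort(-scores)[:k]


# Toy pool: 2 stochastic passes, 3 unlabeled examples, 2 classes.
pool_probs = np.array([
    [[0.5, 0.5], [1.0, 0.0], [1.0, 0.0]],  # pass 1
    [[0.5, 0.5], [0.0, 1.0], [1.0, 0.0]],  # pass 2
])
selected = query(pool_probs, k=1)
```

In the toy pool, the passes disagree only on the second example, so it is the one selected for labeling. The first example is uncertain but consistently so, which a plain entropy criterion would not distinguish from model disagreement.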

