CL · Aug 5, 2021

Robust Transfer Learning with Pretrained Language Models through Adapters

arXiv:2108.02340v1724 citations
Originality: Incremental advance
AI Analysis

This paper addresses robustness issues in transfer learning for NLP, though the advance is incremental because it builds on existing adapter methods.

The paper tackles the instability and vulnerability of fine-tuning large pretrained language models by proposing an adapter-based approach, which improves stability and adversarial robustness across various downstream tasks.

Transfer learning with large pretrained transformer-based language models like BERT has become a dominant approach for most NLP tasks. Simply fine-tuning those large language models on downstream tasks, or combining fine-tuning with task-specific pretraining, is often not robust. In particular, performance varies considerably as the random seed changes or as the number of pretraining and/or fine-tuning iterations varies, and the fine-tuned model is vulnerable to adversarial attacks. We propose a simple yet effective adapter-based approach to mitigate these issues. Specifically, we insert small bottleneck layers (i.e., adapters) within each layer of a pretrained model, then fix the pretrained layers and train the adapter layers on the downstream task data, with (1) task-specific unsupervised pretraining and then (2) task-specific supervised training (e.g., classification, sequence labeling). Our experiments demonstrate that such a training scheme leads to improved stability and adversarial robustness in transfer learning to various downstream tasks.
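
The adapter idea in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the class names `FrozenLayer` and `Adapter`, the ReLU bottleneck with a residual connection, the zero-initialised up-projection, and all sizes are assumptions chosen for illustration, and the "pretrained" weights are random stand-ins. A real setup would use a deep-learning framework and a transformer such as BERT.

```python
import random


def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * v for w, v in zip(row, x)) for row in W]


def relu(x):
    return [max(0.0, v) for v in x]


def rand_matrix(rows, cols, rng, scale=0.1):
    return [[rng.gauss(0.0, scale) for _ in range(cols)] for _ in range(rows)]


class FrozenLayer:
    """Hypothetical stand-in for one pretrained layer; its weights stay fixed."""

    def __init__(self, hidden, rng):
        self.W = rand_matrix(hidden, hidden, rng)

    def __call__(self, h):
        return relu(matvec(self.W, h))

    def num_params(self):
        return len(self.W) * len(self.W[0])


class Adapter:
    """Bottleneck adapter (assumed form): h + W_up @ relu(W_down @ h)."""

    def __init__(self, hidden, bottleneck, rng):
        self.W_down = rand_matrix(bottleneck, hidden, rng)
        # Assumption: zero up-projection, so the adapter starts as the identity.
        self.W_up = [[0.0] * bottleneck for _ in range(hidden)]

    def __call__(self, h):
        z = relu(matvec(self.W_down, h))
        return [a + b for a, b in zip(h, matvec(self.W_up, z))]

    def num_params(self):
        # W_down and W_up each hold hidden * bottleneck weights.
        return 2 * len(self.W_down) * len(self.W_down[0])


def forward(x, layers, adapters=None):
    """Run x through the frozen layers, applying an adapter after each one if given."""
    h = x
    for i, layer in enumerate(layers):
        h = layer(h)
        if adapters is not None:
            h = adapters[i](h)
    return h


rng = random.Random(0)
hidden, bottleneck, n_layers = 16, 2, 3  # illustrative sizes only

layers = [FrozenLayer(hidden, rng) for _ in range(n_layers)]
adapters = [Adapter(hidden, bottleneck, rng) for _ in range(n_layers)]

frozen_params = sum(layer.num_params() for layer in layers)
trainable_params = sum(adapter.num_params() for adapter in adapters)

x = [rng.gauss(0.0, 1.0) for _ in range(hidden)]
baseline = forward(x, layers)
adapted = forward(x, layers, adapters)

print("frozen parameters:   ", frozen_params)
print("trainable parameters:", trainable_params)
print("adapters start as identity:", adapted == baseline)
```

In this sketch only the adapter matrices would receive gradient updates in both training stages (unsupervised pretraining, then supervised training), while the frozen layers keep their pretrained weights. With the zero up-projection assumed here, the adapted model initially reproduces the frozen model's output exactly.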

