CL | Apr 21, 2020

Train No Evil: Selective Masking for Task-Guided Pre-Training

arXiv:2004.09733v21009 citations | Has Code
AI Analysis

This work addresses the gap in efficiency and effectiveness when adapting pre-trained models to specific tasks, though it is incremental in that it builds on the existing pre-train-then-fine-tune paradigm.

The paper addresses the failure of pre-trained language models to capture domain- and task-specific patterns, which stems from task-agnostic pre-training and limited fine-tuning data. It introduces a task-guided pre-training stage with selective masking. On sentiment analysis tasks, the method achieves comparable or better performance at less than 50% of the computation cost.

Recently, pre-trained language models have mostly followed the pre-train-then-fine-tune paradigm and have achieved strong performance on various downstream tasks. However, since the pre-training stage is typically task-agnostic and the fine-tuning stage usually suffers from insufficient supervised data, the models cannot always capture domain-specific and task-specific patterns well. In this paper, we propose a three-stage framework that adds a task-guided pre-training stage with selective masking between general pre-training and fine-tuning. In this stage, the model is trained by masked language modeling on in-domain unsupervised data to learn domain-specific patterns, and we propose a novel selective masking strategy to learn task-specific patterns. Specifically, we design a method to measure the importance of each token in a sequence and selectively mask the important tokens. Experimental results on two sentiment analysis tasks show that our method can achieve comparable or even better performance with less than 50% of the computation cost, which indicates that our method is both effective and efficient. The source code of this paper can be obtained from https://github.com/thunlp/SelectiveMasking.
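The idea of scoring token importance and then masking the highest-scoring tokens can be illustrated with a minimal Python sketch. The abstract does not specify the scoring rule, so everything here is an assumption for illustration only: importance is approximated by the drop in a classifier's confidence when a token is removed, and `toy_confidence`, `token_importance`, `selective_mask`, and the tiny lexicon are hypothetical stand-ins, not the authors' implementation.

```python
from typing import Callable, List, Optional, Tuple

MASK = "[MASK]"

# Hypothetical stand-in for a task classifier's confidence in its prediction.
# A real setup would query a fine-tuned model instead (assumption).
TOY_LEXICON = {"great": 0.9, "terrible": 0.9, "good": 0.6}


def toy_confidence(tokens: List[str]) -> float:
    """Return the strongest sentiment cue found in the tokens (0.0 if none)."""
    return max((TOY_LEXICON.get(t, 0.0) for t in tokens), default=0.0)


def token_importance(
    tokens: List[str], confidence: Callable[[List[str]], float]
) -> List[float]:
    """Score each token by how much confidence drops when it is removed.

    This occlusion-style score is an assumed proxy, not the paper's formula.
    """
    base = confidence(tokens)
    return [
        base - confidence(tokens[:i] + tokens[i + 1:])
        for i in range(len(tokens))
    ]


def selective_mask(
    tokens: List[str],
    confidence: Callable[[List[str]], float],
    mask_ratio: float = 0.15,
) -> Tuple[List[str], List[Optional[str]]]:
    """Mask the highest-scoring tokens and return (masked tokens, MLM labels)."""
    scores = token_importance(tokens, confidence)
    n_mask = max(1, round(len(tokens) * mask_ratio))
    # Indices of the n_mask most important tokens; ties keep original order.
    top = set(
        sorted(range(len(tokens)), key=lambda i: scores[i], reverse=True)[:n_mask]
    )
    masked = [MASK if i in top else t for i, t in enumerate(tokens)]
    # Labels hold the original token at masked positions and None elsewhere.
    labels = [t if i in top else None for i, t in enumerate(tokens)]
    return masked, labels


if __name__ == "__main__":
    sentence = ["the", "movie", "was", "great", "fun"]
    masked, labels = selective_mask(sentence, toy_confidence, mask_ratio=0.2)
    print(masked)
    print(labels)
```

In this sketch the masked sequence and labels would feed an ordinary masked language modeling objective, so the only change from random masking is which positions are selected. Random masking would pick positions uniformly, whereas here the task signal decides them.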

Code Implementations: 1 repo