CLAILGOct 6, 2020

Neural Mask Generator: Learning to Generate Adaptive Word Maskings for Language Model Adaptation

arXiv:2010.02705v1999 citations
Originality Incremental advance
AI Analysis

This addresses the challenge of efficiently fine-tuning language models for domain-specific tasks, representing an incremental improvement over existing masking techniques.

The paper tackles the problem of adapting language models to specific tasks by proposing a method to automatically generate adaptive word maskings for self-supervised pre-training, resulting in improved task performance on question answering and text classification datasets compared to rule-based strategies.

We propose a method to automatically generate a domain- and task-adaptive maskings of the given text for self-supervised pre-training, such that we can effectively adapt the language model to a particular target task (e.g. question answering). Specifically, we present a novel reinforcement learning-based framework which learns the masking policy, such that using the generated masks for further pre-training of the target language model helps improve task performance on unseen texts. We use off-policy actor-critic with entropy regularization and experience replay for reinforcement learning, and propose a Transformer-based policy network that can consider the relative importance of words in a given text. We validate our Neural Mask Generator (NMG) on several question answering and text classification datasets using BERT and DistilBERT as the language models, on which it outperforms rule-based masking strategies, by automatically learning optimal adaptive maskings.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes