CL · Apr 24, 2020

Probabilistically Masked Language Model Capable of Autoregressive Generation in Arbitrary Word Order

arXiv:2004.11579v1 · 1013 citations
Originality: Incremental advance
AI Analysis

This work addresses the problem of integrating NLU and NLG in a single language model, offering an incremental advance that bridges masked and autoregressive methods.

The paper introduces a probabilistically masked language model (PMLM) that combines natural language understanding and generation in a single model: it outperforms BERT on a set of downstream NLU tasks and generates text in arbitrary word order with good quality.

Masked language models and autoregressive language models are two types of language models. While pretrained masked language models such as BERT dominate natural language understanding (NLU) tasks, autoregressive language models such as GPT are especially capable in natural language generation (NLG). In this paper, we propose a probabilistic masking scheme for the masked language model, which we call the probabilistically masked language model (PMLM). We implement a specific PMLM with a uniform prior distribution on the masking ratio, named u-PMLM. We prove that u-PMLM is equivalent to an autoregressive permutated language model. One main advantage of the model is that it supports text generation in arbitrary order with surprisingly good quality, which could potentially enable new applications beyond traditional unidirectional generation. In addition, the pretrained u-PMLM outperforms BERT on a set of downstream NLU tasks.
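The masking scheme and arbitrary-order generation described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' code. It assumes that, once a masking ratio r is drawn from the uniform prior, each token is masked independently with probability r; the abstract only states that the prior on the masking ratio is uniform. The names sample_masked_sequence, generate_in_order, and toy_predictor are hypothetical, and toy_predictor is a stand-in for a trained model.

```python
import random

MASK = "[MASK]"


def sample_masked_sequence(tokens, rng):
    """Mask a token sequence with a randomly drawn masking ratio.

    Assumption: r ~ Uniform(0, 1) is drawn once per sequence, then each
    token is masked independently with probability r.
    """
    r = rng.random()
    masked = [MASK if rng.random() < r else tok for tok in tokens]
    return r, masked


def generate_in_order(length, order, predict_token):
    """Fill a fully masked sequence one position at a time.

    `order` is any permutation of range(length); `predict_token(seq, pos)`
    is a hypothetical callable that returns a token for position `pos`
    given the partially filled sequence `seq`.
    """
    if sorted(order) != list(range(length)):
        raise ValueError("order must be a permutation of range(length)")
    seq = [MASK] * length
    for pos in order:
        seq[pos] = predict_token(seq, pos)
    return seq


def toy_predictor(seq, pos):
    # Hypothetical stand-in for a trained model: labels the position only.
    return f"w{pos}"


if __name__ == "__main__":
    rng = random.Random(0)
    ratio, masked = sample_masked_sequence("the cat sat on the mat".split(), rng)
    print(f"masking ratio {ratio:.2f}:", masked)
    # Generate positions in the order 2, 0, 3, 1 instead of left to right.
    print(generate_in_order(4, [2, 0, 3, 1], toy_predictor))
```

Because the masking ratio varies from 0 to 1 across training examples, the model sees contexts ranging from almost fully observed to almost fully masked, which is the situation it faces at each step of generation in an arbitrary order.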

Code Implementations: 3 repos
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.
