CL · AI · Apr 5, 2022

$\textit{latent}$-GLAT: Glancing at Latent Variables for Parallel Text Generation

CMU
arXiv:2204.02030v1 · 42 citations · h-index: 60
Originality: Incremental advance
AI Analysis

This paper addresses the efficiency-quality trade-off in text generation for NLP applications, though it appears to be an incremental extension of existing GLAT techniques.

The paper tackles the multi-modality problem in parallel text generation by proposing latent-GLAT, which uses discrete latent variables and curriculum learning to remove the need for an autoregressive model during training. The reported results show that it outperforms strong baselines without relying on an autoregressive model, broadening the applicability of parallel decoding.

Recently, parallel text generation has received widespread attention due to its generation efficiency. Although many advanced techniques have been proposed to improve its generation quality, they still need the help of an autoregressive model during training to overcome the one-to-many multi-modal phenomenon in the dataset, which limits their applications. In this paper, we propose $\textit{latent}$-GLAT, which employs discrete latent variables to capture word categorical information and invokes an advanced curriculum learning technique, alleviating the multi-modality problem. Experimental results show that our method outperforms strong baselines without the help of an autoregressive model, which further broadens the application scenarios of the parallel decoding paradigm.
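
The abstract does not spell out the curriculum learning technique. As a rough illustration only, the sketch below follows the glancing sampling idea from the original GLAT line of work, which is an assumption here and not something the abstract confirms: the worse the current parallel prediction, the more reference positions are revealed to the decoder as hints. The function names, the `ratio` parameter, and the `<mask>` placeholder are all hypothetical, and the sequences can be read as either word tokens or discrete latent codes.

```python
import random


def glancing_sample(target, predicted, ratio=0.5, rng=None):
    """Pick reference positions to reveal (hypothetical helper, not the authors' code).

    The number of revealed positions grows with the Hamming distance between
    the reference and the model's first-pass prediction, so a poor prediction
    receives more hints and a good one receives fewer.
    """
    if len(target) != len(predicted):
        raise ValueError("target and predicted must have the same length")
    if not 0.0 <= ratio <= 1.0:
        raise ValueError("ratio must lie in [0, 1]")
    rng = rng or random.Random()
    # Hamming distance: positions where the prediction disagrees with the reference.
    distance = sum(t != p for t, p in zip(target, predicted))
    num_reveal = int(ratio * distance)
    # Assumption: revealed positions are drawn uniformly from all positions.
    return sorted(rng.sample(range(len(target)), num_reveal))


def build_decoder_input(target, revealed, mask_token="<mask>"):
    """Keep reference symbols at revealed positions and mask everything else."""
    revealed = set(revealed)
    return [tok if i in revealed else mask_token for i, tok in enumerate(target)]


if __name__ == "__main__":
    reference = [3, 1, 4, 1, 5, 9]
    first_pass = [3, 1, 0, 0, 0, 0]  # four of six positions are wrong
    hints = glancing_sample(reference, first_pass, ratio=0.5, rng=random.Random(0))
    print(build_decoder_input(reference, hints))
```

In this sketch a perfect first-pass prediction yields no hints, so the decoder input is fully masked and training reduces to plain parallel prediction. That gradual removal of hints is what makes the scheme behave like a curriculum.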

Code Implementations: 1 repo
