CL, LG | May 15, 2023

TESS: Text-to-Text Self-Conditioned Simplex Diffusion

Rabeeh Karimi Mahabadi, Hamish Ivison, Jaesung Tae, James Henderson, Iz Beltagy, Matthew E. Peters, Arman Cohan

arXiv:2305.08379v222.9138 citations | Has Code

Originality: Highly original

AI Analysis

The paper addresses the cost and inefficiency of text generation for NLP researchers and practitioners. It offers a competitive alternative to autoregressive models, though its improvement over existing diffusion-based methods is incremental.

The paper tackles the challenge of applying continuous diffusion models to natural language generation by proposing TESS, a text diffusion model that operates on the logit simplex space with self-conditioning. TESS outperforms state-of-the-art non-autoregressive models and requires fewer diffusion steps with minimal drop in performance.

Diffusion models have emerged as a powerful paradigm for generation, obtaining strong performance in various continuous domains. However, applying continuous diffusion models to natural language remains challenging due to its discrete nature and the need for a large number of diffusion steps to generate text, making diffusion-based generation expensive. In this work, we propose Text-to-text Self-conditioned Simplex Diffusion (TESS), a text diffusion model that is fully non-autoregressive, employs a new form of self-conditioning, and applies the diffusion process on the logit simplex space rather than the learned embedding space. Through extensive experiments on natural language understanding and generation tasks including summarization, text simplification, paraphrase generation, and question generation, we demonstrate that TESS outperforms state-of-the-art non-autoregressive models, requires fewer diffusion steps with minimal drop in performance, and is competitive with pretrained autoregressive sequence-to-sequence models. We publicly release our codebase at https://github.com/allenai/tess-diffusion.
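The idea of diffusing over the logit simplex with self-conditioning can be illustrated with a small sketch. This is not the released TESS code: the function names, the constant `k`, the Gaussian forward step, and the averaging used for self-conditioning are all assumptions chosen for illustration, and the denoising model itself is omitted.

```python
import math
import random


def tokens_to_logits(token_ids, vocab_size, k=5.0):
    """Map each token to an almost-one-hot logit vector (assumed form):
    +k at the token's index and -k everywhere else."""
    return [[k if v == tok else -k for v in range(vocab_size)] for tok in token_ids]


def softmax(logits):
    """Project a logit vector onto the probability simplex."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def add_noise(logits, alpha_bar, k=5.0, rng=random):
    """Assumed Gaussian forward step in logit space:
    sqrt(alpha_bar) * x0 + sqrt(1 - alpha_bar) * eps, with eps ~ N(0, k^2)."""
    signal = math.sqrt(alpha_bar)
    noise = math.sqrt(1.0 - alpha_bar)
    return [[signal * x + noise * rng.gauss(0.0, k) for x in row] for row in logits]


def self_condition(noisy_logits, prev_pred_logits=None):
    """Hypothetical self-conditioning input: average the simplex
    probabilities of the noisy logits and of the previous step's prediction."""
    noisy_probs = [softmax(row) for row in noisy_logits]
    if prev_pred_logits is None:
        return noisy_probs
    prev_probs = [softmax(row) for row in prev_pred_logits]
    return [[0.5 * (a + b) for a, b in zip(p, q)]
            for p, q in zip(noisy_probs, prev_probs)]


if __name__ == "__main__":
    x0 = tokens_to_logits([2, 0], vocab_size=4)
    xt = add_noise(x0, alpha_bar=0.5, rng=random.Random(0))
    model_input = self_condition(xt, prev_pred_logits=x0)
    print([round(sum(row), 6) for row in model_input])
```

In this sketch the noise is applied to the logits rather than to learned embeddings, and every model input stays a valid probability distribution over the vocabulary, which is the property the abstract attributes to working in the simplex space.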
