Can the Transformer Be Used as a Drop-in Replacement for RNNs in Text-Generating GANs?
The work addresses the challenge of efficient text generation for researchers and practitioners, but its contribution is incremental: it documents a specific failure case rather than proposing a new solution.
The paper tackles fine-tuned text generation under a limited computational budget by replacing the LSTM with a Transformer in a GAN architecture. The Transformer under-performed during pre-training and underwent mode collapse during GAN tuning, so it failed as a drop-in replacement.
In this paper we address the problem of fine-tuned text generation with a limited computational budget. To that end, we take a well-performing text generative adversarial network (GAN) architecture, Diversity-Promoting GAN (DPGAN), and attempt a drop-in replacement of its LSTM layer with a self-attention-based Transformer layer in order to leverage the Transformer's efficiency. The resulting Self-Attention DPGAN (SADPGAN) was evaluated for performance, stability, and the quality and diversity of the generated text. Computational experiments suggest that a Transformer layer cannot serve as a drop-in replacement for the LSTM layer: it under-performs during the pre-training phase and undergoes a complete mode collapse during the GAN tuning phase. Our results suggest that the Transformer architecture needs to be adapted before it can be used as a replacement for RNNs in text-generating GANs.
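To make the notion of diversity and mode collapse concrete, the sketch below computes a distinct-n score, the fraction of unique n-grams among all n-grams in a batch of generated samples. This is only an illustrative assumption: the abstract does not state which diversity metric was used, and the function name `distinct_n` and the toy sentences are hypothetical. A batch from a collapsed generator, which repeats the same output, scores far lower than a varied batch.

```python
from collections import Counter


def distinct_n(samples, n=2):
    """Fraction of unique n-grams among all n-grams in tokenized samples.

    Illustrative metric only; not necessarily the one used in the paper.
    """
    ngrams = Counter()
    for tokens in samples:
        # Slide a window of size n over each tokenized sample.
        for i in range(len(tokens) - n + 1):
            ngrams[tuple(tokens[i:i + n])] += 1
    total = sum(ngrams.values())
    return len(ngrams) / total if total else 0.0


# Hypothetical toy batches: one varied, one where the generator repeats itself.
diverse = [s.split() for s in ["the cat sat down", "a dog ran home", "birds fly far away"]]
collapsed = [s.split() for s in ["the cat sat down"] * 3]

diverse_score = distinct_n(diverse)
collapsed_score = distinct_n(collapsed)
```

With these toy inputs every bigram in the varied batch is unique, while the repeated batch reuses the same three bigrams, so the second score is one third of the first.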