CL · LG · Aug 20, 2019

ARAML: A Stable Adversarial Training Framework for Text Generation

arXiv:1908.07195v11002 citations
AI Analysis

This paper addresses training instability in GAN-based text generation. It is aimed at NLP researchers and offers an incremental improvement over existing text GAN methods.

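The paper tackles the instability of reinforcement learning training in text generation GANs by proposing ARAML. The discriminator assigns rewards to samples drawn from a stationary distribution near the data, and the generator is trained with reward-augmented maximum likelihood instead of policy gradient. The authors report that the model outperforms state-of-the-art text GANs with a more stable training process.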

Most of the existing generative adversarial networks (GAN) for text generation suffer from the instability of reinforcement learning training algorithms such as policy gradient, leading to unstable performance. To tackle this problem, we propose a novel framework called Adversarial Reward Augmented Maximum Likelihood (ARAML). During adversarial training, the discriminator assigns rewards to samples which are acquired from a stationary distribution near the data rather than the generator's distribution. The generator is optimized with maximum likelihood estimation augmented by the discriminator's rewards instead of policy gradient. Experiments show that our model can outperform state-of-the-art text GANs with a more stable training process.
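The training signal described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the helper names (`perturb`, `reward_weights`, `weighted_nll`), the temperature `tau`, the random-substitution sampler, and the softmax normalisation of rewards are all assumptions made for illustration, and the discriminator rewards and generator log-probabilities are placeholder numbers rather than model outputs.

```python
import math
import random


def perturb(tokens, vocab, n_edits, rng):
    """Return a copy of `tokens` with up to `n_edits` positions resampled.

    Hypothetical stand-in for drawing a sample "near the data": a
    substitution may pick the original token again, so at most
    `n_edits` positions differ from the input.
    """
    out = list(tokens)
    k = min(n_edits, len(out))
    for pos in rng.sample(range(len(out)), k):
        out[pos] = rng.choice(vocab)
    return out


def reward_weights(rewards, tau=1.0):
    """Turn raw rewards into normalised weights via softmax(reward / tau).

    Assumption: the abstract does not specify how rewards are normalised.
    """
    m = max(rewards)  # subtract the max for numerical stability
    exps = [math.exp((r - m) / tau) for r in rewards]
    z = sum(exps)
    return [e / z for e in exps]


def weighted_nll(log_probs, weights):
    """Reward-weighted negative log-likelihood over a batch of samples."""
    return -sum(w * lp for w, lp in zip(weights, log_probs))


if __name__ == "__main__":
    rng = random.Random(0)
    vocab = ["the", "a", "cat", "dog", "sat", "ran"]
    real = ["the", "cat", "sat"]

    # Samples come from perturbing real data, not from the generator.
    samples = [perturb(real, vocab, n_edits=1, rng=rng) for _ in range(3)]

    rewards = [0.2, 0.7, 0.5]       # placeholder discriminator scores
    log_probs = [-4.1, -3.2, -3.8]  # placeholder generator log p(sample)

    weights = reward_weights(rewards, tau=0.5)
    print(samples, weights, weighted_nll(log_probs, weights))
```

Because the samples are drawn from a fixed perturbation of the data and the loss is a weighted log-likelihood, this sketch involves no sampling from the generator and no policy-gradient estimate, which is the contrast the abstract draws.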

Code Implementations: 1 repo