CL LG Aug 20, 2019

ARAML: A Stable Adversarial Training Framework for Text Generation

Pei Ke, Fei Huang, Minlie Huang, Xiaoyan Zhu

arXiv:1908.07195v1 1002 citations

AI Analysis

This paper addresses training instability in GAN-based text generation. It is aimed at NLP researchers and offers an incremental improvement over existing text GAN methods.

The paper tackles the instability of reinforcement learning training in text generation GANs by proposing ARAML. In this framework, the discriminator assigns rewards to samples drawn from a stationary distribution near the data, and the generator is trained with reward-augmented maximum likelihood rather than policy gradient. The authors report that the model outperforms state-of-the-art text GANs with more stable training.

Most of the existing generative adversarial networks (GAN) for text generation suffer from the instability of reinforcement learning training algorithms such as policy gradient, leading to unstable performance. To tackle this problem, we propose a novel framework called Adversarial Reward Augmented Maximum Likelihood (ARAML). During adversarial training, the discriminator assigns rewards to samples which are acquired from a stationary distribution near the data rather than the generator's distribution. The generator is optimized with maximum likelihood estimation augmented by the discriminator's rewards instead of policy gradient. Experiments show that our model can outperform state-of-the-art text GANs with a more stable training process.
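The two ideas in the abstract, sampling near the data and weighting a likelihood objective by discriminator rewards, can be illustrated with a toy sketch. This is not the authors' implementation. The helper names (`perturb`, `reward_weights`, `weighted_nll`), the token-substitution sampler, the softmax weighting, and the `temperature` parameter are all assumptions made for illustration, since the abstract does not specify these details.

```python
import math
import random


def perturb(tokens, vocab, max_edits, rng):
    """Hypothetical sampler: draw a sentence near a real one by
    substituting up to max_edits tokens. It depends only on the data,
    not on the generator, so the sampling distribution stays fixed."""
    out = list(tokens)
    n_edits = rng.randint(0, min(max_edits, len(out)))
    for pos in rng.sample(range(len(out)), n_edits):
        out[pos] = rng.choice(vocab)
    return out


def reward_weights(rewards, temperature=1.0):
    """Assumed weighting: softmax over discriminator rewards.
    The maximum is subtracted before exponentiating for stability."""
    m = max(rewards)
    exps = [math.exp((r - m) / temperature) for r in rewards]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_nll(log_probs, weights):
    """Reward-weighted negative log-likelihood over the sampled sentences.
    log_probs[i] is the generator's log-probability of sample i."""
    return -sum(w * lp for w, lp in zip(weights, log_probs))


if __name__ == "__main__":
    rng = random.Random(0)
    real = ["the", "cat", "sat", "down"]
    vocab = ["the", "cat", "dog", "sat", "ran", "down", "up"]
    samples = [perturb(real, vocab, max_edits=2, rng=rng) for _ in range(4)]

    # Stand-in discriminator: reward is the fraction of tokens kept from the real sentence.
    rewards = [sum(a == b for a, b in zip(s, real)) / len(real) for s in samples]
    # Stand-in generator log-probabilities (placeholder values, not model output).
    log_probs = [-2.0, -3.5, -1.2, -4.0]

    weights = reward_weights(rewards, temperature=0.5)
    print(weighted_nll(log_probs, weights))
```

In this sketch the loss is an ordinary weighted likelihood, so its gradient does not depend on sampling from the generator, which is the property the abstract contrasts with policy gradient.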
