CL, Nov 10, 2019

Stylized Text Generation Using Wasserstein Autoencoders with a Mixture of Gaussian Prior

arXiv:1911.03828v1, 3 citations
Originality: Incremental advance
AI Analysis

This work addresses the need for controlled text generation in applications like content creation, though it is incremental as it builds on existing Wasserstein autoencoders.

The paper tackles the problem of generating stylized text with control over style and topic in multi-class datasets, achieving diverse and fluent sentences that preserve desired styles without adversarial losses.

Wasserstein autoencoders are effective for text generation. They do not, however, provide any control over the style and topic of the generated sentences if the dataset has multiple classes and includes different topics. In this work, we present a semi-supervised approach for generating stylized sentences. Our model is trained on a multi-class dataset and learns the latent representation of the sentences using a mixture of Gaussian prior without any adversarial losses. This allows us to generate sentences in the style of a specified class or of multiple classes by sampling from their corresponding prior distributions. Moreover, we can train our model on relatively small datasets and learn the latent representation of a specified class by adding external data with other styles/classes to our dataset. While a simple WAE or VAE cannot generate diverse sentences in this case, sentences generated with our approach are diverse, fluent, and preserve the style and the content of the desired classes.
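The sampling step described in the abstract can be illustrated with a minimal sketch. It assumes one Gaussian component per class with unit variance and randomly placed means; in the paper these priors are tied to a trained encoder and decoder, which are not shown here. The names `make_class_priors` and `sample_latents` are hypothetical and do not come from the paper.

```python
import random

def make_class_priors(num_classes, latent_dim, spread=3.0, seed=0):
    """Build one Gaussian prior component per class.

    Assumption: means are drawn at random and every component has unit
    standard deviation; the paper's actual parameters may differ.
    """
    rng = random.Random(seed)
    means = [[rng.gauss(0.0, spread) for _ in range(latent_dim)]
             for _ in range(num_classes)]
    stds = [[1.0] * latent_dim for _ in range(num_classes)]
    return means, stds

def sample_latents(means, stds, class_weights, n, rng):
    """Draw n latent vectors from a weighted mixture of the class priors.

    A one-hot weight vector targets a single class; spreading the weight
    over several classes blends their styles. Weights need not sum to 1.
    """
    classes = rng.choices(range(len(means)), weights=class_weights, k=n)
    latents = [[rng.gauss(m, s) for m, s in zip(means[c], stds[c])]
               for c in classes]
    return latents, classes

means, stds = make_class_priors(num_classes=3, latent_dim=8)
rng = random.Random(1)

# Single style: every latent vector comes from the prior of class 1.
z_single, c_single = sample_latents(means, stds, [0, 1, 0], n=4, rng=rng)

# Two styles: latent vectors come from classes 0 and 1 in equal proportion.
z_mixed, c_mixed = sample_latents(means, stds, [0.5, 0.5, 0], n=1000, rng=rng)
```

In a full model, each sampled latent vector would be passed to the trained decoder to produce a sentence; that decoding step is outside the scope of this sketch.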

