Romantic-Computing
This work addresses the problem of evaluating AI creativity in poetry generation, with researchers as its audience. Its contribution is incremental: it applies existing models to a new creative context.
The paper compares text generation models on writing poetry in the style of early English Romanticism. It finds that transformer models outperform character-level RNNs and that quality improves with parameter count, as measured by syllable count and the GRUEN metric.
In this paper we compare the ability of several text generation models to write poetry in the style of early English Romanticism. The models are: character-level recurrent neural networks with long short-term memory, Hugging Face's GPT-2, OpenAI's GPT-3, and EleutherAI's GPT-Neo. Quality was measured by syllable count and by coherence, the latter scored with the automatic evaluation metric GRUEN. Character-level recurrent neural networks performed far worse than the transformer models. Among the transformer models, poem quality improved as parameter count increased. These models are rarely compared in a creative context, and we are happy to contribute such a comparison.
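The abstract does not say how syllable counts were computed, so the following is only a minimal sketch of what a syllable-based measure could look like. It assumes a rough vowel-group heuristic rather than a pronunciation dictionary, and the function names (`count_syllables`, `line_syllables`, `syllable_deviation`) and the default target of 10 syllables per line are hypothetical choices for illustration, not the authors' method.

```python
import re

# Runs of vowels (treating 'y' as a vowel) approximate syllable nuclei.
VOWEL_GROUP = re.compile(r"[aeiouy]+")


def count_syllables(word: str) -> int:
    """Rough syllable estimate for one English word (heuristic, not exact)."""
    word = re.sub(r"[^a-z]", "", word.lower())
    if not word:
        return 0
    count = len(VOWEL_GROUP.findall(word))
    # A trailing 'e' is usually silent ("lone"), but not in "-le" ("little").
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def line_syllables(line: str) -> int:
    """Sum the estimated syllables of every whitespace-separated word."""
    return sum(count_syllables(w) for w in line.split())


def syllable_deviation(poem: str, target: int = 10) -> float:
    """Mean absolute difference between each non-empty line's estimated
    syllable count and a target count (the target is an assumption)."""
    lines = [ln for ln in poem.splitlines() if ln.strip()]
    if not lines:
        return 0.0
    return sum(abs(line_syllables(ln) - target) for ln in lines) / len(lines)
```

A lower deviation would indicate lines closer to the chosen meter. The heuristic miscounts many words (for example, past-tense "-ed" endings), so a pronunciation dictionary would likely be needed for any serious evaluation.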