CL · Jun 24, 2024

The GPT-WritingPrompts Dataset: A Comparative Analysis of Character Portrayal in Short Stories

arXiv:2406.16767v2 · 8 citations · Has Code
Originality: Synthesis-oriented
AI Analysis

This work helps researchers and practitioners in natural language processing and creative writing understand the biases of AI-generated creative content and how it differs from human writing. Its contribution is incremental, since it builds on existing datasets and methods.

The authors quantitatively compare human-written and machine-generated short stories by augmenting the Reddit WritingPrompts dataset with GPT-3.5 outputs for the same prompts and analyzing emotional and descriptive features along six dimensions. They find significant differences along all six dimensions, and similar biases in both human and machine stories when grouped by narrative point-of-view and protagonist gender.

The improved generative capabilities of large language models have made them a powerful tool for creative writing and storytelling. It is therefore important to quantitatively understand the nature of generated stories, and how they differ from human storytelling. We augment the Reddit WritingPrompts dataset with short stories generated by GPT-3.5, given the same prompts. We quantify and compare the emotional and descriptive features of storytelling from both generative processes, human and machine, along a set of six dimensions. We find that generated stories differ significantly from human stories along all six dimensions, and that human and machine generations display similar biases when grouped according to the narrative point-of-view and gender of the main protagonist. We release our dataset and code at https://github.com/KristinHuangg/gpt-writing-prompts.
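As a rough illustration of what comparing paired human and machine stories along one dimension could look like, the sketch below runs a sign-flip permutation test on per-prompt score differences. This is not the authors' method: the abstract does not name the six dimensions or the statistical test used, so the function name `paired_permutation_test`, the choice of test, and the score values are all assumptions made for illustration.

```python
import random
from statistics import mean


def paired_permutation_test(human, machine, n_resamples=10_000, seed=0):
    """Two-sided sign-flip permutation test on the mean paired difference.

    `human[i]` and `machine[i]` are scores for stories written from the
    same prompt (hypothetical per-story feature scores).
    """
    if len(human) != len(machine):
        raise ValueError("scores must be paired, one pair per prompt")
    diffs = [h - m for h, m in zip(human, machine)]
    observed = mean(diffs)
    rng = random.Random(seed)
    extreme = 0
    for _ in range(n_resamples):
        # Under the null hypothesis the sign of each difference is arbitrary.
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(mean(flipped)) >= abs(observed):
            extreme += 1
    # Add-one correction so the estimated p-value is never exactly zero.
    p_value = (extreme + 1) / (n_resamples + 1)
    return observed, p_value


# Made-up scores for one dimension; not data from the paper.
human_scores = [0.62, 0.48, 0.71, 0.55, 0.66, 0.59, 0.73, 0.51]
machine_scores = [0.41, 0.39, 0.52, 0.47, 0.44, 0.50, 0.49, 0.38]
diff, p = paired_permutation_test(human_scores, machine_scores)
print(f"mean difference = {diff:.3f}, p = {p:.4f}")
```

A paired test fits this setup because each prompt yields one human and one machine story, so prompt-level variation cancels in the differences. The released repository is the place to check which features and tests the authors actually used.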

Code Implementations: 1 repo