CLHCJun 3, 2025

The Reader is the Metric: How Textual Features and Reader Profiles Explain Conflicting Evaluations of AI Creative Writing

arXiv:2506.03310v14 citationsh-index: 6ACL
Originality Incremental advance
AI Analysis

This addresses the problem of inconsistent AI creative writing evaluations for researchers and practitioners, offering a reader-sensitive framework, though it is incremental in refining evaluation methods.

The study tackled conflicting evaluations of AI-generated vs. human-authored literary texts by showing that differences arise from reader preferences, not intrinsic text quality, using 1,471 stories and 101 annotators to identify two reader profiles that explain quality assessments.

Recent studies comparing AI-generated and human-authored literary texts have produced conflicting results: some suggest AI already surpasses human quality, while others argue it still falls short. We start from the hypothesis that such divergences can be largely explained by genuine differences in how readers interpret and value literature, rather than by an intrinsic quality of the texts evaluated. Using five public datasets (1,471 stories, 101 annotators including critics, students, and lay readers), we (i) extract 17 reference-less textual features (e.g., coherence, emotional variance, average sentence length...); (ii) model individual reader preferences, deriving feature importance vectors that reflect their textual priorities; and (iii) analyze these vectors in a shared "preference space". Reader vectors cluster into two profiles: 'surface-focused readers' (mainly non-experts), who prioritize readability and textual richness; and 'holistic readers' (mainly experts), who value thematic development, rhetorical variety, and sentiment dynamics. Our results quantitatively explain how measurements of literary quality are a function of how text features align with each reader's preferences. These findings advocate for reader-sensitive evaluation frameworks in the field of creative text generation.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes