LLM Review: Enhancing Creative Writing via Blind Peer Review Feedback
This paper addresses content homogenization in multi-agent creative writing systems and is aimed at AI researchers and developers.
LLMs struggle with creative generation. To address this, the paper introduces LLM Review, a peer-review-inspired framework that preserves divergent creative trajectories through blind feedback exchange. It reports that the framework consistently outperforms multi-agent baselines and enables smaller models to surpass larger single-agent models.
Large Language Models (LLMs) often struggle with creative generation, and multi-agent frameworks that improve reasoning through interaction can paradoxically hinder creativity by inducing content homogenization. We introduce LLM Review, a peer-review-inspired framework implementing Blind Peer Review: agents exchange targeted feedback while revising independently, preserving divergent creative trajectories. To enable rigorous evaluation, we propose SciFi-100, a science fiction writing dataset with a unified framework combining LLM-as-a-judge scoring, human annotation, and rule-based novelty metrics. Experiments demonstrate that LLM Review consistently outperforms multi-agent baselines, and smaller models with our framework can surpass larger single-agent models, suggesting interaction structure may substitute for model scale.
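The Blind Peer Review loop described above (agents exchange targeted feedback but revise independently) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the `Agent` callable interface, the `TASK:`/`REVIEW:`/`REVISE:` prompt prefixes, the single review round, and the `make_stub_agent` helper are all assumptions introduced here. The abstract does not specify these details.

```python
from typing import Callable, Dict, List

# Hypothetical agent interface (an assumption): prompt text in, generated text out.
Agent = Callable[[str], str]


def blind_peer_review(agents: Dict[str, Agent], task: str) -> Dict[str, str]:
    """Run one draft -> review -> revise round and return each agent's revision."""
    # Step 1: every agent drafts from the task alone.
    drafts = {name: agent(f"TASK: {task}") for name, agent in agents.items()}

    # Step 2: each agent writes feedback on every peer's draft.
    feedback: Dict[str, List[str]] = {name: [] for name in agents}
    for reviewer, agent in agents.items():
        for author, draft in drafts.items():
            if author != reviewer:
                feedback[author].append(agent(f"REVIEW: {draft}"))

    # Step 3: each author revises using only its own draft and the feedback
    # it received. Peers' drafts and revisions are never placed in the prompt.
    revised: Dict[str, str] = {}
    for name, agent in agents.items():
        notes = "\n".join(feedback[name])
        revised[name] = agent(f"REVISE: {drafts[name]}\nFEEDBACK:\n{notes}")
    return revised


def make_stub_agent(name: str, log: List[str]) -> Agent:
    """Stand-in for an LLM call (hypothetical); records every prompt it sees."""
    def agent(prompt: str) -> str:
        log.append(f"{name} <- {prompt}")
        if prompt.startswith("TASK:"):
            return f"draft-by-{name}"
        if prompt.startswith("REVIEW:"):
            return f"note-from-{name}"
        return f"final-by-{name}"
    return agent


if __name__ == "__main__":
    prompt_log: List[str] = []
    team = {n: make_stub_agent(n, prompt_log) for n in ("a", "b", "c")}
    print(blind_peer_review(team, "write a science fiction story"))
```

The design choice the sketch tries to capture is in step 3: keeping peer drafts out of the revision prompt is one plausible way to let feedback flow between agents without pulling their stories toward each other. In a real system, each stub would be replaced by a model call.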