CL · AI · Dec 8, 2024

Steering Large Language Models to Evaluate and Amplify Creativity

arXiv:2412.06060v1 · 3 citations · h-index: 5
Originality: Incremental advance
AI Analysis

This paper addresses the challenge of evaluating and improving the creativity of AI-generated text, with applications in content creation and assessment.

The paper tackles the problem that LLMs are poor judges of creativity. It extracts the differences in an LLM's internal states when the model is prompted to respond 'boringly' versus 'creatively' and uses them as a measure of creativity. That measure corresponds strongly with human judgment, and the same differences can be applied at inference time to make generated text more creative.

Although capable of generating creative text, Large Language Models (LLMs) are poor judges of what constitutes "creativity". In this work, we show that we can leverage this knowledge of how to write creatively in order to better judge what is creative. We take a mechanistic approach that extracts differences in the internal states of an LLM when prompted to respond "boringly" or "creatively" to provide a robust measure of creativity that corresponds strongly with human judgment. We also show these internal state differences can be applied to enhance the creativity of generated text at inference time.
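The abstract does not spell out how the internal-state differences are computed, so the following is only a minimal sketch of one common way to realize the idea: a difference-of-means direction between "creative" and "boring" hidden states, a projection onto that direction as a creativity score, and a shift along it as steering. Everything here is an assumption for illustration: the function names (`steering_direction`, `creativity_score`, `steer`), the strength parameter `alpha`, and the synthetic vectors that stand in for real LLM activations. It is not the authors' implementation.

```python
import random


def mean_vector(rows):
    """Element-wise mean of a list of equal-length vectors."""
    n = len(rows)
    return [sum(col) / n for col in zip(*rows)]


def steering_direction(creative_states, boring_states):
    """Unit vector from the mean 'boring' state to the mean 'creative' state.

    Assumption: a plain difference of means; the paper may extract the
    difference in another way.
    """
    mean_creative = mean_vector(creative_states)
    mean_boring = mean_vector(boring_states)
    diff = [c - b for c, b in zip(mean_creative, mean_boring)]
    norm = sum(d * d for d in diff) ** 0.5
    if norm == 0.0:
        raise ValueError("creative and boring states have identical means")
    return [d / norm for d in diff]


def creativity_score(state, direction):
    """Projection of a hidden state onto the direction (higher = more 'creative')."""
    return sum(s * d for s, d in zip(state, direction))


def steer(state, direction, alpha):
    """Shift a hidden state by alpha along the direction (hypothetical steering step)."""
    return [s + alpha * d for s, d in zip(state, direction)]


# Synthetic stand-ins for hidden states collected under the two prompt styles.
rng = random.Random(0)
dim = 8
axis = [1.0 if i == 0 else 0.0 for i in range(dim)]  # hidden 'creativity' axis


def fake_state(sign):
    # Gaussian noise plus a shift of +/-2 along the hidden axis.
    return [rng.gauss(0.0, 1.0) + sign * 2.0 * a for a in axis]


creative = [fake_state(+1) for _ in range(50)]
boring = [fake_state(-1) for _ in range(50)]

direction = steering_direction(creative, boring)
sample = boring[0]
print("score before steering:", creativity_score(sample, direction))
print("score after steering: ", creativity_score(steer(sample, direction, 4.0), direction))
```

Because the direction is normalized to unit length, steering with strength `alpha` raises the projection score by exactly `alpha`, which keeps the scoring and steering steps on the same scale. In a real setting the vectors would come from a chosen layer of the model rather than from a random generator, and the layer choice and strength would need tuning.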

