CL, AI | Dec 12, 2025

Does Less Hallucination Mean Less Creativity? An Empirical Investigation in LLMs

Mohor Banerjee, Nadya Yuki Wangsajaya, Syed Ali Redha Alsagoff, Min Sen Tan, Zachary Choy Kit Chun, Alvin Chan Guo Wei

arXiv:2512.11509v2 | 1 citation | h-index: 8

Originality: Incremental advance

AI Analysis

This work addresses the trade-off between factual accuracy and creative exploration in AI-assisted scientific discovery and offers practical guidance for choosing a hallucination-reduction method.

The study investigated how three hallucination-reduction techniques (CoVe, DoLa, RAG) affect creativity in LLMs, finding that CoVe enhances divergent thinking, DoLa suppresses it, and RAG has minimal impact, based on evaluations across multiple model families and scales on creativity benchmarks.

Large Language Models (LLMs) exhibit remarkable capabilities in natural language understanding and reasoning, but suffer from hallucination: the generation of factually incorrect content. While numerous methods have been developed to reduce hallucinations, their impact on creative generation remains unexplored. This gap is particularly critical for AI-assisted scientific discovery, which requires both factual accuracy and creative hypothesis generation. We investigate how three hallucination-reduction techniques, Chain of Verification (CoVe), Decoding by Contrasting Layers (DoLa), and Retrieval-Augmented Generation (RAG), affect creativity in LLMs. Evaluating multiple model families (LLaMA, Qwen, Mistral) at varying scales (1B to 70B parameters) on two creativity benchmarks (NeoCoder and CS4), we find that these methods have opposing effects on divergent creativity: CoVe enhances divergent thinking, DoLa suppresses it, and RAG shows minimal impact. Our findings provide guidance for selecting appropriate hallucination-reduction methods in scientific applications, where the balance between factual accuracy and creative exploration is crucial.
