CL AI LG | Oct 14, 2025

Multi-stage Prompt Refinement for Mitigating Hallucinations in Large Language Models

Jung-Woo Shim, Yeong-Joon Ju, Ji-Hoon Park, Seong-Whan Lee

arXiv:2510.12032v14.91 citations | h-index: 4

Originality: Incremental advance

AI Analysis

The paper addresses the problem of unreliable outputs for users of LLMs, but the advance is incremental because it builds on existing mitigation frameworks.

The paper tackles hallucinations in large language models caused by ill-formed prompts by introducing Multi-stage Prompt Refinement (MPR), which improves prompt clarity. Refined prompts achieve a win rate of over 85% against their original forms on hallucination benchmarks.

Recent advancements in large language models (LLMs) have shown strong performance in natural language understanding and generation tasks. However, LLMs continue to encounter challenges with hallucinations, where models generate plausible but incorrect information. While several factors contribute to hallucinations, the impact of ill-formed prompts (prompts with ambiguous wording, incorrect grammar, or incomplete information) has been relatively underexplored. To address this, we introduce Multi-stage Prompt Refinement (MPR), a framework designed to systematically improve these ill-formed prompts across multiple stages. Each stage addresses specific errors such as punctuation, typographical mistakes, and misuse of key terms, using small language models (SLMs) fine-tuned for these tasks. MPR iteratively enhances the clarity of prompts with additional context and employs a self-reflection mechanism with ranking to prioritize the most relevant input. Experimental results on hallucination benchmarks show that prompts refined by MPR achieve a win rate of over 85% compared to their original forms, demonstrating its effectiveness in reducing hallucinations and improving LLM output accuracy. Interestingly, we reveal that MPR can be combined with existing post-hoc hallucination mitigation frameworks, further enhancing its versatility. MPR provides a lightweight and adaptable solution for enhancing LLM reliability across various domains.
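
To make the staged design in the abstract concrete, here is a minimal Python sketch of a prompt passing through ordered refinement stages, plus a helper for the win-rate metric. It is an illustration under stated assumptions, not the authors' implementation. The names `fix_punctuation`, `fix_typos`, `TYPO_FIXES`, `refine`, and `win_rate` are hypothetical, and the rule-based functions stand in for the fine-tuned SLMs the abstract describes. The context-enrichment and self-reflection ranking steps are omitted.

```python
import re
from typing import Callable, Iterable

# A stage maps a prompt to a cleaner prompt. In the paper each stage is a
# fine-tuned small language model; here each is a toy rule (assumption).
Stage = Callable[[str], str]

# Hypothetical typo table, for illustration only.
TYPO_FIXES = {"teh": "the", "captial": "capital"}


def fix_punctuation(prompt: str) -> str:
    """Collapse whitespace and remove spaces before punctuation marks."""
    text = re.sub(r"\s+", " ", prompt).strip()
    return re.sub(r"\s+([,.?!])", r"\1", text)


def fix_typos(prompt: str) -> str:
    """Replace known misspellings word by word, leaving other words alone."""
    def replace(match: re.Match) -> str:
        word = match.group(0)
        return TYPO_FIXES.get(word.lower(), word)
    return re.sub(r"[A-Za-z]+", replace, prompt)


def refine(prompt: str, stages: Iterable[Stage]) -> str:
    """Apply each stage in order, feeding its output to the next stage."""
    for stage in stages:
        prompt = stage(prompt)
    return prompt


def win_rate(wins: int, comparisons: int) -> float:
    """Fraction of pairwise comparisons won by the refined prompt."""
    if comparisons <= 0:
        raise ValueError("comparisons must be positive")
    return wins / comparisons


if __name__ == "__main__":
    raw = "what  is teh captial of France ?"
    print(refine(raw, [fix_punctuation, fix_typos]))
    print(win_rate(17, 20))
```

Keeping each stage as a plain function from string to string means a rule could later be swapped for a model call without changing `refine`. How the paper judges a "win" between a refined and an original prompt is not stated in the abstract, so `win_rate` only does the final arithmetic.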
