CL | Jan 28, 2025

Few-Shot Optimized Framework for Hallucination Detection in Resource-Limited NLP Systems

Baraa Hikal, Ahmed Nasreldin, Ali Hamdi, Ammar Mohammed

arXiv:2501.16616v1 | 8.36 citations | h-index: 3

Originality: Incremental advance

AI Analysis

The work addresses unreliable outputs in applications such as machine translation in resource-limited NLP systems, but the advance is incremental: it builds on existing methods and adds optimizations.

The paper tackles hallucination detection in NLP systems with a framework that combines few-shot optimization for weak-label generation, fine-tuning of Mistral-7B-Instruct-v0.3, and ensemble learning, reaching 85.5% accuracy on the SHROOM test set.

Hallucination detection in text generation remains an ongoing struggle for natural language processing (NLP) systems, frequently resulting in unreliable outputs in applications such as machine translation and definition modeling. Existing methods struggle with data scarcity and the limitations of unlabeled datasets, as highlighted by the SHROOM shared task at SemEval-2024. In this work, we propose a novel framework to address these challenges, introducing DeepSeek Few-shot optimization to enhance weak label generation through iterative prompt engineering. We achieved high-quality annotations that considerably enhanced the performance of downstream models by restructuring data to align with instruct generative models. We further fine-tuned the Mistral-7B-Instruct-v0.3 model on these optimized annotations, enabling it to accurately detect hallucinations in resource-limited settings. Combining this fine-tuned model with ensemble learning strategies, our approach achieved 85.5% accuracy on the test set, setting a new benchmark for the SHROOM task. This study demonstrates the effectiveness of data restructuring, few-shot optimization, and fine-tuning in building scalable and robust hallucination detection frameworks for resource-constrained NLP systems.
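
As a rough illustration of two steps the abstract names, few-shot prompting for weak labels and ensembling of model predictions, the sketch below builds a few-shot prompt and combines several predicted labels by majority vote. The abstract does not specify the prompt format, the ensemble strategy, or the tie-breaking rule, so the function names, the prompt wording, the label strings, and the tie policy here are all assumptions for illustration, not the authors' implementation.

```python
from collections import Counter

# Assumed label strings; the listing does not state the exact format used.
LABELS = ("Hallucination", "Not Hallucination")


def build_few_shot_prompt(examples, source, hypothesis):
    """Assemble a few-shot prompt from labeled examples (hypothetical format)."""
    parts = ["Decide whether the hypothesis is supported by the source."]
    for ex in examples:
        parts.append(
            f"Source: {ex['source']}\n"
            f"Hypothesis: {ex['hypothesis']}\n"
            f"Label: {ex['label']}"
        )
    # The final block leaves the label open for the model to complete.
    parts.append(f"Source: {source}\nHypothesis: {hypothesis}\nLabel:")
    return "\n\n".join(parts)


def majority_vote(predictions):
    """Return the most frequent label among the ensemble members.

    Ties resolve to the first entry in LABELS (an assumed policy).
    """
    if not predictions:
        raise ValueError("predictions must not be empty")
    unknown = set(predictions) - set(LABELS)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    counts = Counter(predictions)
    top = max(counts.values())
    return next(label for label in LABELS if counts.get(label, 0) == top)
```

In use, each ensemble member (for example, a fine-tuned model checkpoint or a differently prompted run) would contribute one label per example, and `majority_vote` would produce the final decision.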
