CL AI LG Aug 22, 2024

SLM Meets LLM: Balancing Latency, Interpretability and Consistency in Hallucination Detection

Mengya Hu, Rui Xu, Deren Lei, Yaxi Li, Mingyu Wang, Emily Ching, Eslam Kamal, Alex Deng

arXiv:2408.12748v13.48 citations h-index: 7 Has Code

Originality: Incremental advance

AI Analysis

This work addresses latency and interpretability in hallucination detection for real-time applications, but it appears incremental as it combines existing SLM and LLM components with prompting techniques.

The paper tackles the latency of large language models (LLMs) in real-time hallucination detection by proposing a two-stage framework: a small language model (SLM) performs the initial detection, and an LLM generates explanations for the content the SLM flags as hallucinated.

Large language models (LLMs) are highly capable but face latency challenges in real-time applications, such as online hallucination detection. To overcome this issue, we propose a novel framework that leverages a small language model (SLM) classifier for initial detection, followed by an LLM as a constrained reasoner to generate detailed explanations for detected hallucinated content. This study optimizes real-time interpretable hallucination detection by introducing effective prompting techniques that align LLM-generated explanations with SLM decisions. Experimental results demonstrate its effectiveness, thereby enhancing the overall user experience.
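The two-stage flow described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the names `detect_and_explain`, `Verdict`, and `CONSTRAINED_PROMPT`, the prompt wording, the threshold, and both toy stand-in models are assumptions made for illustration. The point it shows is the routing: the cheap SLM score is computed for every response, and the LLM is called only for responses the SLM flags.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional

# Hypothetical prompt wording (an assumption, not the paper's prompt).
# It tells the LLM the decision is already made and asks only for a reason.
CONSTRAINED_PROMPT = (
    "A classifier flagged the RESPONSE as not supported by the SOURCE.\n"
    "Explain which parts are unsupported.\n"
    "SOURCE: {source}\nRESPONSE: {response}"
)


@dataclass
class Verdict:
    hallucinated: bool
    score: float
    explanation: Optional[str] = None  # filled only for flagged responses


def detect_and_explain(
    source: str,
    response: str,
    slm_score: Callable[[str, str], float],
    llm_generate: Callable[[str], str],
    threshold: float = 0.5,  # assumed cut-off, would be tuned in practice
) -> Verdict:
    """Run the fast SLM check first; call the LLM only when it flags."""
    score = slm_score(source, response)
    if score < threshold:
        # Common path: no LLM call, so no LLM latency.
        return Verdict(False, score)
    prompt = CONSTRAINED_PROMPT.format(source=source, response=response)
    return Verdict(True, score, llm_generate(prompt))


def toy_slm_score(source: str, response: str) -> float:
    """Stand-in for an SLM classifier: share of response tokens absent from the source."""
    src_tokens = set(source.lower().split())
    tokens = response.lower().split()
    if not tokens:
        return 0.0
    return sum(t not in src_tokens for t in tokens) / len(tokens)


llm_calls: List[str] = []  # records prompts so LLM usage can be inspected


def toy_llm(prompt: str) -> str:
    """Stand-in for an LLM endpoint: logs the prompt and returns a fixed string."""
    llm_calls.append(prompt)
    return "stub explanation"
```

In a real deployment `toy_slm_score` and `toy_llm` would be replaced by a trained classifier and an LLM client. The design choice the sketch highlights is that explanation cost is paid only on the flagged fraction of traffic.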
