LG · Apr 22, 2025

ORION Grounded in Context: Retrieval-Based Method for Hallucination Detection

Assaf Gerner, Netta Madvil, Nadav Barak, Alex Zaikman, Jonatan Liberman, Liron Hamra, Rotem Brazilay, Shay Tsadok, Yaron Friedman, Neal Harow, Noam Bressler, Shir Chorev

arXiv:2504.15771v37.11 citations · h-index: 2

Originality: Incremental advance

AI Analysis

This paper addresses hallucination detection for production-scale LLM applications in domains such as summarization and RAG. The contribution appears incremental because it builds on existing RAG and NLI approaches.

The paper tackles hallucinated answers in production LLM applications by presenting a retrieval-based framework for hallucination detection. The method achieves an F1 score of 0.83 on RAGTruth's response-level classification task, matching methods trained on that dataset and outperforming comparable frameworks that use similar-sized models.

Despite advancements in grounded content generation, production applications based on Large Language Models (LLMs) still suffer from hallucinated answers. We present "Grounded in Context", a member of Deepchecks' ORION (Output Reasoning-based InspectiON) family of lightweight evaluation models. It is our framework for hallucination detection, designed for production-scale long-context data and tailored to diverse use cases, including summarization, data extraction, and RAG. Inspired by the RAG architecture, our method integrates retrieval and Natural Language Inference (NLI) models to predict factual consistency between premises and hypotheses using an encoder-based model with only a 512-token context window. Our framework identifies unsupported claims with an F1 score of 0.83 on RAGTruth's response-level classification task, matching methods that were trained on the dataset and outperforming all comparable frameworks that use similar-sized models.
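
The retrieve-then-verify idea described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: every function name here is hypothetical, word-count chunking stands in for a real tokenizer's 512-token limit, sentence splitting stands in for claim extraction, and `toy_entailment` is a lexical-overlap placeholder where a real NLI model would go.

```python
import re
from typing import Callable, List


def tokenize(text: str) -> List[str]:
    """Lowercase alphanumeric tokens; a crude stand-in for a model tokenizer."""
    return re.findall(r"[a-z0-9]+", text.lower())


def chunk_context(context: str, max_tokens: int = 512) -> List[str]:
    """Split the context into chunks of at most max_tokens whitespace-separated words.
    Assumption: word count approximates the encoder's token budget."""
    words = context.split()
    return [" ".join(words[i:i + max_tokens]) for i in range(0, len(words), max_tokens)]


def retrieve(claim: str, chunks: List[str], top_k: int = 2) -> List[str]:
    """Rank chunks by token overlap with the claim and keep the top_k.
    Assumption: a real system would likely use a learned retriever instead."""
    claim_tokens = set(tokenize(claim))
    ranked = sorted(
        chunks,
        key=lambda chunk: len(claim_tokens & set(tokenize(chunk))),
        reverse=True,
    )
    return ranked[:top_k]


def toy_entailment(premise: str, hypothesis: str) -> float:
    """Placeholder scorer: fraction of hypothesis tokens found in the premise.
    This is NOT an NLI model; swap in a real entailment scorer in practice."""
    hyp_tokens = set(tokenize(hypothesis))
    if not hyp_tokens:
        return 1.0
    return len(hyp_tokens & set(tokenize(premise))) / len(hyp_tokens)


def unsupported_claims(
    context: str,
    response: str,
    nli: Callable[[str, str], float] = toy_entailment,
    threshold: float = 0.8,
    max_tokens: int = 512,
    top_k: int = 2,
) -> List[str]:
    """Return response sentences whose best premise score falls below threshold."""
    chunks = chunk_context(context, max_tokens)
    # Assumption: each sentence of the response is treated as one claim (hypothesis).
    claims = [s.strip() for s in re.split(r"(?<=[.!?])\s+", response) if s.strip()]
    flagged = []
    for claim in claims:
        premises = retrieve(claim, chunks, top_k)
        best = max((nli(premise, claim) for premise in premises), default=0.0)
        if best < threshold:
            flagged.append(claim)
    return flagged
```

The scorer is passed in as a callable so that the placeholder can be replaced by an encoder-based NLI model without changing the surrounding logic. The threshold of 0.8 is an arbitrary value chosen for the sketch, not one reported in the paper.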
