CL CR Sep 30, 2024

Ingest-And-Ground: Dispelling Hallucinations from Continually-Pretrained LLMs with RAG

Chenhao Fang, Derek Larson, Shitong Zhu, Sophie Zeng, Wendy Summer, Yanqing Peng, Yuriy Hulovatyy, Rajeev Rao, Gabriel Forgues, Arya Pudota, Alex Goncalves, Hervé Robert

arXiv:2410.02825v2 | 5 citations | h-index: 2

Originality: Incremental advance

AI Analysis

This work aims to improve the efficiency of organizational privacy processes by reducing LLM hallucinations on privacy-related queries, a strong gain within that specific domain.

This paper addresses the problem of hallucinations in LLMs when handling privacy-related queries. By continually pre-training an LLM on a privacy-specific knowledge base and augmenting it with a semantic RAG layer, the authors report performance metrics as much as doubled compared to an out-of-the-box LLM.

This paper presents new methods that have the potential to improve privacy process efficiency using LLMs and RAG. To reduce hallucination, we continually pre-train the base LLM on a privacy-specific knowledge base and then augment it with a semantic RAG layer. Our evaluations demonstrate that this approach improves model performance on privacy-related queries (metrics as much as doubled compared to an out-of-the-box LLM) by grounding responses in factual information, which reduces inaccuracies.
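As a rough illustration of the grounding step only, the sketch below retrieves the most relevant knowledge-base passage for a query and places it in the prompt. It is not the authors' implementation: the bag-of-words `embed` function is a toy stand-in for a real semantic embedding model, the continual pre-training stage is not shown, and `KNOWLEDGE_BASE`, `retrieve`, and `build_grounded_prompt` are hypothetical names with invented example data.

```python
import math
import re
from collections import Counter


def embed(text):
    """Toy stand-in for a semantic embedding: a bag-of-words count vector."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, passages, k=1):
    """Return the k passages most similar to the query."""
    q = embed(query)
    ranked = sorted(passages, key=lambda p: cosine(q, embed(p)), reverse=True)
    return ranked[:k]


def build_grounded_prompt(query, passages, k=1):
    """Build a prompt that asks the model to answer from retrieved context."""
    context = "\n".join(f"- {p}" for p in retrieve(query, passages, k))
    return (
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {query}\n"
    )


# Hypothetical knowledge-base entries, invented for illustration only.
KNOWLEDGE_BASE = [
    "Access logs are retained for 90 days before deletion.",
    "Users may request an export of their personal data.",
    "Encryption keys are rotated every quarter.",
]
```

Calling `build_grounded_prompt("How long are access logs retained?", KNOWLEDGE_BASE)` yields a prompt whose context holds only the retention passage. In a full pipeline that prompt would be passed to the continually pre-trained model.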
