CL, AI, Dec 25, 2023

Alleviating Hallucinations of Large Language Models through Induced Hallucinations

arXiv:2312.15710v2, 92 citations, h-index: 20, NAACL
Originality: Incremental advance
AI Analysis

This paper addresses inaccurate information generation in LLMs, a critical issue for reliable AI applications. The method is incremental, however, because it builds on existing decoding techniques.

The paper tackles hallucinations in large language models by proposing an Induce-then-Contrast Decoding (ICD) strategy, which improves factuality by penalizing induced untruthful predictions during decoding. With ICD, Llama2-7B-Chat and Mistral-7B-Instruct reach performance comparable to ChatGPT and GPT4, respectively, on TruthfulQA.

Despite their impressive capabilities, large language models (LLMs) have been observed to generate responses that include inaccurate or fabricated information, a phenomenon commonly known as "hallucination". In this work, we propose a simple Induce-then-Contrast Decoding (ICD) strategy to alleviate hallucinations. We first construct a factually weak LLM by inducing hallucinations from the original LLMs. Then, we penalize these induced hallucinations during decoding to enhance the factuality of the generated content. Concretely, we determine the final next-token predictions by amplifying the predictions from the original model and downplaying the induced untruthful predictions via contrastive decoding. Experimental results on both discrimination-based and generation-based hallucination evaluation benchmarks, such as TruthfulQA and FActScore, demonstrate that our proposed ICD methods can effectively enhance the factuality of LLMs across various model sizes and families. For example, when equipped with ICD, Llama2-7B-Chat and Mistral-7B-Instruct achieve performance comparable to ChatGPT and GPT4 on TruthfulQA, respectively.
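The contrastive step described in the abstract can be illustrated with a small sketch. This is not the authors' implementation: the function names, the weighting `(1 + alpha) * log p_orig - alpha * log p_weak`, the `alpha` and `plausibility` parameters, and the toy logits are all assumptions chosen to show the general idea of amplifying the original model's predictions while penalizing those of a hallucination-prone model. The plausibility cutoff is a filter commonly paired with contrastive decoding, included here as an assumption.

```python
import math


def log_softmax(logits):
    """Numerically stable log-softmax over a list of floats."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def contrast_next_token(orig_logits, weak_logits, alpha=1.0, plausibility=0.1):
    """Pick the next token by contrasting two models' predictions.

    orig_logits: next-token logits from the original model.
    weak_logits: next-token logits from the factually weak model
                 (hypothetical here; in the paper it is obtained by
                 inducing hallucinations from the original model).
    alpha:       assumed contrast strength (illustrative parameter).
    plausibility: assumed cutoff; tokens whose original-model probability
                 is below plausibility * max probability are excluded.
    """
    lp_orig = log_softmax(orig_logits)
    lp_weak = log_softmax(weak_logits)
    cutoff = math.log(plausibility) + max(lp_orig)
    scores = [
        (1 + alpha) * po - alpha * pw if po >= cutoff else float("-inf")
        for po, pw in zip(lp_orig, lp_weak)
    ]
    best = max(range(len(scores)), key=scores.__getitem__)
    return best, scores


# Toy three-token vocabulary with made-up logits.
# Token 0: slightly preferred by the original model, strongly by the weak one.
# Token 1: close second for the original model, disfavored by the weak one.
# Token 2: implausible under the original model.
orig_logits = [2.0, 1.8, -3.0]
weak_logits = [3.0, 0.0, 0.0]

greedy = max(range(len(orig_logits)), key=orig_logits.__getitem__)
best, scores = contrast_next_token(orig_logits, weak_logits)
print("greedy pick:", greedy, "contrastive pick:", best)
```

In this toy case, greedy decoding on the original model selects token 0, while the contrastive score selects token 1, because the weak model's strong preference for token 0 is subtracted. Token 2 is masked out by the plausibility cutoff, so the penalty term cannot promote a token the original model considers very unlikely.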

Code Implementations: 2 repos
