CL | Oct 21, 2024

Can Knowledge Editing Really Correct Hallucinations?

Baixiang Huang, Canyu Chen, Xiongxiao Xu, Ali Payani, Kai Shu

arXiv:2410.16251v315.437 citations | h-index: 16 | Has Code | ICLR

Originality: Incremental advance

AI Analysis

This paper addresses the need for better evaluation of knowledge editing techniques that aim to correct hallucinations in LLMs, an incremental but important step toward improving model reliability.

The paper evaluates whether knowledge editing methods can correct hallucinations in Large Language Models. It introduces HalluEditBench, a benchmark with more than 6,000 hallucinations across 9 domains, and reports that the benchmark yields new insights into the potential and limitations of different editing methods.

Large Language Models (LLMs) suffer from hallucinations, that is, non-factual information in generated content, despite their superior capacities across tasks. Meanwhile, knowledge editing has emerged as a popular paradigm for correcting erroneous factual knowledge encoded in LLMs, with the advantage of avoiding retraining from scratch. However, a common issue with existing evaluation datasets for knowledge editing is that they do not ensure that LLMs actually generate hallucinated answers to the evaluation questions before editing. When LLMs are evaluated on such datasets after being edited by different techniques, the resulting performance cannot be used directly to assess how effectively each knowledge editing method corrects hallucinations. Thus, the fundamental question remains insufficiently validated: Can knowledge editing really correct hallucinations in LLMs? We propose HalluEditBench to holistically benchmark knowledge editing methods in correcting real-world hallucinations. First, we rigorously construct a massive hallucination dataset with 9 domains, 26 topics, and more than 6,000 hallucinations. Then, we assess the performance of knowledge editing methods holistically on five dimensions: Efficacy, Generalization, Portability, Locality, and Robustness. Through HalluEditBench, we provide new insights into the potential and limitations of different knowledge editing methods in correcting hallucinations, which could inspire future improvements and facilitate progress in the field of knowledge editing.
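The central point of the abstract, that an evaluation question is only meaningful if the model actually hallucinates on it before editing, can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the function names, the exact-match comparison, and the toy lookup-table "models" are hypothetical and are not taken from HalluEditBench or its code, and the paper's own metric definitions may differ.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical types: a "model" is any function mapping a question to an answer.
Model = Callable[[str], str]
QAPair = Tuple[str, str]  # (question, ground-truth answer)


def is_hallucination(answer: str, ground_truth: str) -> bool:
    """Simplifying assumption: an answer is a hallucination when it does not
    match the ground truth after trimming and lowercasing."""
    return answer.strip().lower() != ground_truth.strip().lower()


def select_hallucinations(model: Model, qa_pairs: List[QAPair]) -> List[QAPair]:
    """Keep only the questions the pre-edit model answers incorrectly."""
    return [(q, gt) for q, gt in qa_pairs if is_hallucination(model(q), gt)]


def efficacy(edited_model: Model, hallucinations: List[QAPair]) -> float:
    """Fraction of verified hallucinations that the edited model now answers
    correctly (an illustrative stand-in for an efficacy-style score)."""
    if not hallucinations:
        return 0.0
    fixed = sum(
        not is_hallucination(edited_model(q), gt) for q, gt in hallucinations
    )
    return fixed / len(hallucinations)


# Toy data: lookup tables stand in for a real LLM before and after editing.
qa_pairs: List[QAPair] = [
    ("What is the capital of Australia?", "Canberra"),
    ("What is the capital of France?", "Paris"),
]
pre_edit_answers: Dict[str, str] = {
    "What is the capital of Australia?": "Sydney",  # hallucinated
    "What is the capital of France?": "Paris",      # already correct
}
post_edit_answers: Dict[str, str] = {
    "What is the capital of Australia?": "Canberra",
    "What is the capital of France?": "Paris",
}

hallucinated = select_hallucinations(lambda q: pre_edit_answers[q], qa_pairs)
score = efficacy(lambda q: post_edit_answers[q], hallucinated)
```

In this toy setup only the Australia question survives the filter, so the score reflects a real correction rather than a fact the model already knew. Scoring both questions without the filter would credit the editing method for the France answer as well, which is the evaluation problem the abstract describes.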
