CL AI · Feb 24, 2023

Check Your Facts and Try Again: Improving Large Language Models with External Knowledge and Automated Feedback

Baolin Peng, Michel Galley, Pengcheng He, Hao Cheng, Yujia Xie, Yu Hu, Qiuyuan Huang, Lars Liden, Zhou Yu, Weizhu Chen, Jianfeng Gao

Microsoft

arXiv:2302.12813v3 · 31.6513 citations · h-index: 62

Originality: Incremental advance

AI Analysis

The paper addresses the challenge of applying LLMs to real-world, mission-critical applications by mitigating hallucinations, an incremental improvement for users who rely on accurate AI-generated content.

The paper tackles the problem of hallucinations and lack of external knowledge in large language models (LLMs) like ChatGPT by proposing LLM-Augmenter, a system that grounds responses in external knowledge and uses automated feedback for iterative improvement. It significantly reduces hallucinations without compromising fluency and informativeness, as validated on task-oriented dialog and open-domain question answering.

Large language models (LLMs), such as ChatGPT, are able to generate human-like, fluent responses for many downstream tasks, e.g., task-oriented dialog and question answering. However, applying LLMs to real-world, mission-critical applications remains challenging mainly due to their tendency to generate hallucinations and their inability to use external knowledge. This paper proposes a LLM-Augmenter system, which augments a black-box LLM with a set of plug-and-play modules. Our system makes the LLM generate responses grounded in external knowledge, e.g., stored in task-specific databases. It also iteratively revises LLM prompts to improve model responses using feedback generated by utility functions, e.g., the factuality score of a LLM-generated response. The effectiveness of LLM-Augmenter is empirically validated on two types of scenarios, task-oriented dialog and open-domain question answering. LLM-Augmenter significantly reduces ChatGPT's hallucinations without sacrificing the fluency and informativeness of its responses. We make the source code and models publicly available.
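The feedback loop described in the abstract (ground the prompt in retrieved evidence, score the response with a utility function, revise the prompt and try again) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the names `augmented_generate`, `token_overlap_utility`, `fake_llm`, and `fake_retrieve`, the prompt wording, the 0.8 threshold, and the token-overlap score are all hypothetical stand-ins for the paper's plug-and-play modules and utility functions.

```python
from typing import Callable, List, Tuple


def token_overlap_utility(response: str, evidence: List[str]) -> float:
    """Toy factuality proxy (an assumption, not the paper's metric):
    fraction of response tokens that also appear in the evidence."""
    resp_tokens = response.lower().split()
    if not resp_tokens:
        return 0.0
    evidence_tokens = set(" ".join(evidence).lower().split())
    return sum(t in evidence_tokens for t in resp_tokens) / len(resp_tokens)


def augmented_generate(
    query: str,
    llm: Callable[[str], str],             # black-box LLM: prompt -> response
    retrieve: Callable[[str], List[str]],  # external knowledge lookup
    utility: Callable[[str, List[str]], float] = token_overlap_utility,
    threshold: float = 0.8,                # hypothetical acceptance score
    max_iters: int = 3,
) -> Tuple[str, float]:
    """Return the best-scoring response seen within max_iters attempts."""
    evidence = retrieve(query)
    # Ground the initial prompt in the retrieved evidence.
    prompt = "Evidence:\n" + "\n".join(evidence) + f"\nQuestion: {query}\nAnswer:"
    best_response, best_score = "", -1.0
    for _ in range(max_iters):
        response = llm(prompt)
        score = utility(response, evidence)
        if score > best_score:
            best_response, best_score = response, score
        if score >= threshold:
            break
        # Revise the prompt with automated feedback and try again.
        prompt += (
            f" {response}\nFeedback: the answer above scored {score:.2f} "
            "against the evidence. Revise it using only the evidence.\nAnswer:"
        )
    return best_response, best_score


# Stubs standing in for a real LLM and a task-specific database.
def fake_retrieve(query: str) -> List[str]:
    return ["the hotel has free parking"]


def fake_llm(prompt: str) -> str:
    # Produces an unsupported answer first, a grounded one after feedback.
    if "Feedback:" in prompt:
        return "the hotel has free parking"
    return "parking costs twenty dollars"


answer, score = augmented_generate("Is parking free?", fake_llm, fake_retrieve)
print(answer, score)
```

In this sketch the LLM is treated as a black box: only the prompt changes between iterations, which mirrors the abstract's description of revising prompts rather than model weights.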
