PURR: Efficiently Editing Language Model Hallucinations by Denoising Language Model Corruptions
This work addresses false claims generated by language models, targeting users who need efficient and accurate editing of model outputs. The contribution is incremental in that it builds on existing editing approaches.
The paper tackles language model hallucinations by introducing PURR, a compact editor trained to denoise corrupted text using relevant evidence. PURR improves attribution and runs orders of magnitude faster than existing editing methods.
The remarkable capabilities of large language models have been accompanied by a persistent drawback: the generation of false and unsubstantiated claims, commonly known as "hallucinations". To combat this issue, recent research has introduced approaches that edit and attribute the outputs of language models, particularly through prompt-based editing. However, the inference cost and latency of using large language models for editing currently bottleneck prompt-based methods. These bottlenecks motivate the training of compact editors, which is challenging due to the scarcity of training data for this purpose. To overcome these challenges, we exploit the power of large language models to introduce corruptions (i.e., noise) into text and subsequently fine-tune compact editors to denoise the corruptions by incorporating relevant evidence. Our methodology is entirely unsupervised and provides us with faux hallucinations for training in any domain. Our Petite Unsupervised Research and Revision model, PURR, not only improves attribution over existing editing methods based on fine-tuning and prompting, but also achieves faster execution times by orders of magnitude.
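The corrupt-then-denoise recipe described above can be illustrated with a minimal Python sketch of how one training pair might be assembled. It is an illustration under stated assumptions, not the authors' implementation: the names `EditorExample`, `build_denoising_example`, and `toy_corrupt` are hypothetical, the "claim: ... evidence: ..." input layout is an assumed format, and the string-replacement corruptor stands in for the large language model that would introduce noise in practice.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class EditorExample:
    """One (input, output) pair for fine-tuning a compact editor."""
    source: str  # corrupted text plus evidence, as seen by the editor
    target: str  # original clean text the editor should recover


def build_denoising_example(
    clean_text: str,
    evidence: List[str],
    corrupt: Callable[[str], str],
) -> EditorExample:
    """Corrupt clean text, then pair it with evidence and the clean target.

    `corrupt` is a placeholder for a noising step; the abstract describes
    using a large language model for this, which is not reproduced here.
    """
    corrupted = corrupt(clean_text)
    # Assumed input layout; the actual format is not given in the abstract.
    source = "claim: " + corrupted + " evidence: " + " ".join(evidence)
    return EditorExample(source=source, target=clean_text)


def toy_corrupt(text: str) -> str:
    """Hypothetical stand-in corruptor: swaps one fact to create a faux hallucination."""
    return text.replace("1969", "1972")


example = build_denoising_example(
    clean_text="Apollo 11 landed on the Moon in 1969.",
    evidence=["Apollo 11 was the first crewed Moon landing, in July 1969."],
    corrupt=toy_corrupt,
)
print(example.source)
print(example.target)
```

Because the clean text serves as its own target, no human-labeled edits are needed in this setup, which matches the unsupervised framing of the abstract. A compact sequence-to-sequence model could then be fine-tuned on many such pairs.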