CL · AI · Mar 27, 2024

Improving Attributed Text Generation of Large Language Models via Preference Learning

arXiv:2403.18381v1 · 32 citations · h-index: 10 · ACL
Originality: Incremental advance
AI Analysis

This work targets misinformation and hallucinations in AI-generated text, a concern for users who rely on credible information. It is an incremental improvement over existing attribution methods.

The paper tackles unreliable content generation in large language models by modelling attribution as preference learning. It introduces an Automatic Preference Optimization (APO) framework that achieves state-of-the-art citation F1 on ASQA, StrategyQA, and ELI5.

Large language models have been widely adopted in natural language processing, yet they face the challenge of generating unreliable content. Recent works aim to reduce misinformation and hallucinations by resorting to attribution as a means to provide evidence (i.e., citations). However, current attribution methods usually focus on the retrieval stage and automatic evaluation, neglecting to mirror the citation mechanisms that human scholarly writing uses to bolster credibility. In this paper, we address these challenges by modelling the attribution task as preference learning and introducing an Automatic Preference Optimization (APO) framework. First, we create a curated collection for post-training with 6,330 examples by collecting and filtering from existing datasets. Second, considering the high cost of labelling preference data, we further propose an automatic method to synthesize attribution preference data, resulting in 95,263 pairs. Moreover, inspired by the human citation process, we further propose a progressive preference optimization method by leveraging fine-grained information. Extensive experiments on three datasets (i.e., ASQA, StrategyQA, and ELI5) demonstrate that APO achieves state-of-the-art citation F1 with higher answer quality.
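The headline metric, citation F1, is the harmonic mean of citation precision and citation recall. The sketch below illustrates only that combination step. It assumes each generated statement comes with a known set of supporting passage ids, which is a simplification and not necessarily how the paper scores citations. The function name `citation_prf` and the data layout are hypothetical.

```python
from typing import List, Set, Tuple


def citation_prf(predicted: List[Set[int]], gold: List[Set[int]]) -> Tuple[float, float, float]:
    """Return (precision, recall, F1) over cited passage ids.

    predicted[i] holds the passage ids cited for statement i;
    gold[i] holds the passage ids assumed to support statement i.
    Both layouts are illustrative assumptions, not the paper's protocol.
    """
    # Count cited passages that are also in the assumed supporting set.
    correct = sum(len(p & g) for p, g in zip(predicted, gold))
    n_pred = sum(len(p) for p in predicted)
    n_gold = sum(len(g) for g in gold)

    precision = correct / n_pred if n_pred else 0.0
    recall = correct / n_gold if n_gold else 0.0
    # F1 is the harmonic mean; define it as 0 when both terms are 0.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


if __name__ == "__main__":
    predicted = [{1, 2}, {3}]
    gold = [{1}, {3, 4}]
    print(citation_prf(predicted, gold))
```

Because F1 is a harmonic mean, a system that cites every retrieved passage (high recall, low precision) or almost nothing (high precision, low recall) scores poorly, so the metric rewards citations that are both complete and relevant.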
