AI · Apr 12, 2022

Make The Most of Prior Data: A Solution for Interactive Text Summarization with Preference Feedback

Duy-Hung Nguyen, Nguyen Viet Dung Nghiem, Bao-Sinh Nguyen, Dung Tien Le, Shahab Sabahi, Minh-Tien Nguyen, Hung Le

arXiv:2204.05512v256.5634 citations · h-index: 26

Originality: Incremental advance

AI Analysis

This work addresses the challenge of aligning AI-generated summaries with human interests in practical, interactive settings, though it appears incremental as it builds on existing preference learning methods.

The paper tackles the problem of training summarization models with interactive human preference feedback, which is critical due to scarce and ambiguous ground-truth summaries, and reports improvements in ROUGE scores and sample-efficiency across three datasets.

For summarization, human preference is critical for steering the summarizer's outputs toward human interests, as ground-truth summaries are scarce and ambiguous. Practical settings require dynamic exchanges between the human and the AI agent, in which feedback is provided online, a few at a time. In this paper, we introduce a new framework for interactively training summarization models with preference feedback. By properly leveraging offline data and a novel reward model, we improve performance in terms of ROUGE scores and sample efficiency. Our experiments on three different datasets confirm the benefit of the proposed framework in active, few-shot, and online settings of preference learning.
