AI · Apr 12, 2022

Make The Most of Prior Data: A Solution for Interactive Text Summarization with Preference Feedback

arXiv:2204.05512v2634 citations · h-index: 26
Originality: Incremental advance
AI Analysis

This work addresses the challenge of aligning AI-generated summaries with human interests in practical, interactive settings, though it appears incremental as it builds on existing preference learning methods.

The paper tackles the problem of training summarization models with interactive human preference feedback, which is critical due to scarce and ambiguous ground-truth summaries, and reports improvements in ROUGE scores and sample-efficiency across three datasets.

For summarization, human preferences are critical for steering the summarizer's outputs toward human interests, because ground-truth summaries are scarce and ambiguous. Practical settings require dynamic exchanges between a human and an AI agent in which feedback is provided online, a few at a time. In this paper, we introduce a new framework for interactively training summarization models with preference feedback. By properly leveraging offline data and a novel reward model, we improve performance in terms of ROUGE scores and sample efficiency. Our experiments on three different datasets confirm the benefit of the proposed framework in active, few-shot, and online settings of preference learning.
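The abstract does not spell out the reward model, so the following is only a rough illustration of the general idea of learning a reward from pairwise preference feedback that arrives online. It is a minimal sketch assuming a linear reward over two toy features and the standard Bradley-Terry preference formulation; it is not the paper's reward model. The names `features`, `reward`, and `update` are hypothetical.

```python
import math


def features(summary, document):
    """Toy feature vector (an assumption, not from the paper):
    [fraction of summary words found in the document, length ratio]."""
    summary_words = summary.lower().split()
    document_words = document.lower().split()
    vocabulary = set(document_words)
    overlap = sum(w in vocabulary for w in summary_words) / max(len(summary_words), 1)
    length_ratio = len(summary_words) / max(len(document_words), 1)
    return [overlap, length_ratio]


def reward(weights, x):
    """Linear reward: dot product of weights and features."""
    return sum(w * xi for w, xi in zip(weights, x))


def update(weights, x_preferred, x_rejected, lr=0.5):
    """One gradient step on the Bradley-Terry loss
    -log sigmoid(reward(preferred) - reward(rejected))."""
    margin = reward(weights, x_preferred) - reward(weights, x_rejected)
    p = 1.0 / (1.0 + math.exp(-margin))  # modelled probability of the observed preference
    scale = lr * (1.0 - p)               # larger step when the model disagrees with the human
    return [w + scale * (a - b) for w, a, b in zip(weights, x_preferred, x_rejected)]
```

In an interactive loop, each new human comparison between two candidate summaries would trigger one call to `update`, so the reward estimate improves a few judgments at a time; the same function could first be run over logged comparisons to make use of offline data before any live feedback arrives.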

