CL, AI
Dec 21, 2022

OpineSum: Entailment-based self-training for abstractive opinion summarization

arXiv:2212.10791v1223 citations
h-index: 23
Originality: Highly original
AI Analysis

This paper addresses the challenge of summarizing opinions about products and places when large-scale labeled datasets are unavailable, offering a practical way to generate summaries from large numbers of reviews.

The paper tackles the problem of summarizing large numbers of product or place reviews without extensive labeled data by introducing OpineSum, a self-training approach that uses textual entailment to capture consensus opinions. It achieves state-of-the-art performance in unsupervised and few-shot abstractive summarization settings.

A typical product or place often has hundreds of reviews, and summarization of these texts is an important and challenging problem. Recent progress on abstractive summarization in domains such as news has been driven by supervised systems trained on hundreds of thousands of news articles paired with human-written summaries. However, for opinion texts, such large-scale datasets are rarely available. Unsupervised methods, self-training, and few-shot learning approaches bridge that gap. In this work, we present a novel self-training approach, OpineSum, for abstractive opinion summarization. The summaries in this approach are built using a novel application of textual entailment and capture the consensus of opinions across the various reviews for an item. This method can be used to obtain silver-standard summaries on a large scale and train both unsupervised and few-shot abstractive summarization systems. OpineSum achieves state-of-the-art performance in both settings.
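
The abstract does not spell out how entailment is turned into a consensus summary, so the following is only a minimal sketch of the general idea, not the authors' pipeline. It assumes that a candidate statement counts as a consensus opinion when many reviews entail it, and that the best-supported candidates form a silver-standard summary. The names `toy_entails` and `consensus_propositions`, the threshold, and `top_k` are hypothetical; `toy_entails` is a token-overlap stand-in for a real textual entailment model.

```python
from typing import Callable, List


def toy_entails(premise: str, hypothesis: str) -> float:
    """Hypothetical stand-in for an entailment model.

    Returns the fraction of hypothesis tokens that also occur in the
    premise. A real system would call a trained NLI model here.
    """
    premise_tokens = set(premise.lower().split())
    hypothesis_tokens = hypothesis.lower().split()
    if not hypothesis_tokens:
        return 0.0
    hits = sum(token in premise_tokens for token in hypothesis_tokens)
    return hits / len(hypothesis_tokens)


def consensus_propositions(
    reviews: List[str],
    candidates: List[str],
    entails: Callable[[str, str], float] = toy_entails,
    threshold: float = 0.8,  # assumed cut-off, not taken from the paper
    top_k: int = 2,          # assumed summary length, not taken from the paper
) -> List[str]:
    """Rank candidates by how many reviews entail them and keep the top ones."""
    scored = []
    for candidate in candidates:
        support = sum(
            1 for review in reviews if entails(review, candidate) >= threshold
        )
        scored.append((support, candidate))
    # Stable sort: candidates with equal support keep their input order.
    scored.sort(key=lambda pair: -pair[0])
    return [candidate for support, candidate in scored[:top_k] if support > 0]


reviews = [
    "the staff was friendly and the room was clean",
    "the staff was friendly but the street was noisy",
    "room was clean and the staff was friendly",
]
candidates = ["pool was cold", "room was clean", "staff was friendly"]
silver_summary = consensus_propositions(reviews, candidates)
print(silver_summary)
```

In this sketch, pairs of (reviews, `silver_summary`) would serve as silver-standard training examples for a summarization model; how the paper actually selects candidates and trains on them should be taken from the paper itself.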
