CL · AI · Nov 17, 2022

Summarizing Community-based Question-Answer Pairs

arXiv:2211.09892v1290 citations · h-index: 19
Originality: Incremental advance
AI Analysis

This paper addresses the difficulty users face in digesting key information from large numbers of CQA pairs in domains such as e-commerce, travel, and dining. The contribution is incremental because it builds on existing summarization methods.

The authors tackle information overload in Community-based Question Answering (CQA) by proposing a new summarization task that creates a concise summary from CQA pairs. They also establish a benchmark dataset, CoQASUM, and a strong baseline method, DedupLED.

Community-based Question Answering (CQA), which allows users to acquire their desired information, has increasingly become an essential component of online services in various domains such as E-commerce, travel, and dining. However, an overwhelming number of CQA pairs makes it difficult for users without particular intent to find useful information spread over CQA pairs. To help users quickly digest the key information, we propose the novel CQA summarization task that aims to create a concise summary from CQA pairs. To this end, we first design a multi-stage data annotation process and create a benchmark dataset, CoQASUM, based on the Amazon QA corpus. We then compare a collection of extractive and abstractive summarization methods and establish a strong baseline approach DedupLED for the CQA summarization task. Our experiment further confirms two key challenges, sentence-type transfer and deduplication removal, towards the CQA summarization task. Our data and code are publicly available.

