CL · Nov 27, 2023

Overview of the VLSP 2022 -- Abmusu Shared Task: A Data Challenge for Vietnamese Abstractive Multi-document Summarization

arXiv:2311.15525v1 · 1 citations · h-index: 6
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for automated summarization systems for Vietnamese news, but its contribution is incremental, since it focuses on a single language and dataset.

The paper tackles Vietnamese abstractive multi-document summarization by introducing a shared task and a dataset of 1,839 documents in 600 clusters; participating models are evaluated by ROUGE2-F1 score.

This paper presents an overview of the VLSP 2022 Vietnamese abstractive multi-document summarization (Abmusu) shared task for Vietnamese news. The task was hosted at the 9th annual workshop on Vietnamese Language and Speech Processing (VLSP 2022). The goal of the Abmusu shared task is to develop summarization systems that can automatically create abstractive summaries for a set of documents on a topic. The model input is multiple news documents on the same topic, and the corresponding output is a related abstractive summary. Within the scope of the Abmusu shared task, we focus only on Vietnamese news summarization and build a human-annotated dataset of 1,839 documents in 600 clusters, collected from Vietnamese news in 8 categories. Participating models are evaluated and ranked by ROUGE2-F1 score, the most typical evaluation metric for document summarization.
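
To make the ranking metric concrete, here is a minimal sketch of a ROUGE-2 F1 computation for one candidate summary against one reference. It is an illustration only, not the shared task's official scorer: the function names `bigrams` and `rouge2_f1` are hypothetical, and the lowercasing and whitespace tokenization are simplifying assumptions (a real Vietnamese evaluation would likely apply word segmentation first).

```python
from collections import Counter


def bigrams(tokens):
    """Count the adjacent token pairs in a token list."""
    return Counter(zip(tokens, tokens[1:]))


def rouge2_f1(candidate, reference):
    """ROUGE-2 F1 between two strings.

    Assumption: tokens are lowercased whitespace-separated words.
    """
    cand = bigrams(candidate.lower().split())
    ref = bigrams(reference.lower().split())
    # Clipped overlap: each bigram counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)
```

For instance, `rouge2_f1("a b c d", "a b c e")` shares two of three bigrams on each side, so precision and recall are both 2/3 and the F1 score is 2/3.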

