CL · AI · Dec 8, 2023

FREDSum: A Dialogue Summarization Corpus for French Political Debates

Virgile Rennard, Guokan Shang, Damien Grari, Julie Hunter, Michalis Vazirgiannis

arXiv:2312.04843v121.3133 citations · h-index: 58 · Has Code · EMNLP

Originality Synthesis-oriented

AI Analysis

This work addresses the shortage of multilingual datasets for dialogue summarization research, though its contribution is incremental: it extends existing methods to a new language and domain.

The authors tackled the lack of resources for multi-party dialogue summarization in non-English languages by creating FREDSum, a dataset of French political debates with manual transcriptions and annotations, and provided baseline experiments using state-of-the-art methods.

Recent advances in deep learning, and especially the invention of encoder-decoder architectures, have significantly improved the performance of abstractive summarization systems. The majority of research has focused on written documents, however, neglecting the problem of multi-party dialogue summarization. In this paper, we present a dataset of French political debates for the purpose of enhancing resources for multilingual dialogue summarization. Our dataset consists of manually transcribed and annotated political debates, covering a range of topics and perspectives. We highlight the importance of high-quality transcription and annotations for training accurate and effective dialogue summarization models, and emphasize the need for multilingual resources to support dialogue summarization in non-English languages. We also provide baseline experiments using state-of-the-art methods, and encourage further research in this area to advance the field of dialogue summarization. Our dataset will be made publicly available for use by the research community.
