CL, Dec 21, 2020

Leveraging ParsBERT and Pretrained mT5 for Persian Abstractive Text Summarization

arXiv:2012.11204v1, 34 citations
AI Analysis

This work provides a baseline for future research in Persian abstractive text summarization, a domain that currently lacks established methods and datasets.

This paper tackles Persian abstractive text summarization by fine-tuning mT5 and an encoder-decoder version of ParsBERT on a newly introduced dataset called pn-summary. The models achieved promising results, establishing a baseline for future research in this area.

Text summarization is one of the most critical Natural Language Processing (NLP) tasks, and more research is conducted in this field every day. Pre-trained transformer-based encoder-decoder models have begun to gain popularity for this task. This paper proposes two methods to address it and introduces a novel dataset named pn-summary for Persian abstractive text summarization. The models employed are mT5 and an encoder-decoder version of the ParsBERT model (i.e., a monolingual BERT model for Persian). Both models are fine-tuned on the pn-summary dataset. The current work is the first of its kind and, by achieving promising results, can serve as a baseline for future work.
