CL AI IR LG

Jan 10, 2021

Summaformers @ LaySumm 20, LongSumm 20

Sayar Ghosh Roy, Nikhil Pinnaparaju, Risubh Jain, Manish Gupta, Vasudeva Varma

arXiv:2101.03553v1999 citations

Originality: Incremental advance

AI Analysis

This work provides an incremental improvement in automatic text summarization for scientific papers, which could benefit researchers and the general public by making complex information more accessible.

This paper addresses the problem of summarizing scientific research papers, differentiating between short, layman-friendly summaries (LaySumm) and longer, detailed summaries (LongSumm). The authors developed Transformer-based systems that leverage the contribution of specific paper sections to human summaries, achieving first place in the LongSumm task and third place in the LaySumm task on blind test corpora.

Automatic text summarization has been widely studied as an important task in natural language processing. Traditionally, various feature engineering and machine learning based systems have been proposed for extractive as well as abstractive text summarization. Recently, deep learning based systems, specifically Transformer-based ones, have been immensely popular. Summarization is a cognitively challenging task: extracting summary-worthy sentences is laborious, and expressing semantics in brief when doing abstractive summarization is complicated. In this paper, we specifically look at the problem of summarizing scientific research papers from multiple domains. We differentiate between two types of summaries, namely, (a) LaySumm: a very short summary that captures the essence of the research paper in layman's terms, restricting overly specific technical jargon, and (b) LongSumm: a much longer, detailed summary aimed at providing specific insights into various ideas touched upon in the paper. While leveraging the latest Transformer-based models, our systems are simple, intuitive, and based on how specific paper sections contribute to human summaries of the two types described above. Evaluations against gold standard summaries using ROUGE metrics prove the effectiveness of our approach. On blind test corpora, our system ranks first and third for the LongSumm and LaySumm tasks respectively.

