CL LG · Jun 3, 2020

Automatic Text Summarization of COVID-19 Medical Research Articles using BERT and GPT-2

Virapat Kieuvongngam, Bowen Tan, Yiming Niu

arXiv:2006.01997v14.0111 citations · Has Code

Originality: Synthesis-oriented

AI Analysis

This work addresses the problem of information overload for medical researchers during the COVID-19 pandemic by providing automated summaries, but it is incremental as it applies existing NLP methods to a new dataset.

The authors summarize COVID-19 medical research articles to help the medical community keep up with the rapidly growing literature. They use BERT and GPT-2 models to generate abstractive summaries based on keywords extracted from the articles, and they evaluate the results with ROUGE scores and visual inspection.

With the COVID-19 pandemic, there is a growing urgency for the medical community to keep up with the accelerating growth in the new coronavirus-related literature. As a result, the COVID-19 Open Research Dataset Challenge has released a corpus of scholarly articles and is calling for machine learning approaches to help bridge the gap between researchers and the rapidly growing body of publications. Here, we take advantage of recent advances in pre-trained NLP models, BERT and OpenAI GPT-2, to address this challenge by performing text summarization on this dataset. We evaluate the results using ROUGE scores and visual inspection. Our model provides abstractive and comprehensive information based on keywords extracted from the original articles. Our work can help the medical community by providing succinct summaries of articles for which abstracts are not already available.
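For readers unfamiliar with the ROUGE evaluation mentioned in the abstract, the following is a minimal sketch of ROUGE-1 (unigram overlap) in plain Python. It is not the authors' evaluation code: the function name `rouge_1`, the lowercase whitespace tokenization, and the example sentences are assumptions made for illustration, and published ROUGE implementations typically add stemming and other variants such as ROUGE-2 and ROUGE-L.

```python
from collections import Counter


def rouge_1(reference: str, candidate: str) -> dict:
    """Simplified ROUGE-1: unigram overlap between a reference and a candidate.

    Assumption: tokens are lowercase, whitespace-separated words, with no
    stemming or stopword handling.
    """
    ref_counts = Counter(reference.lower().split())
    cand_counts = Counter(candidate.lower().split())

    # Clipped overlap: each word counts at most as often as it appears in both.
    overlap = sum((ref_counts & cand_counts).values())

    recall = overlap / max(sum(ref_counts.values()), 1)
    precision = overlap / max(sum(cand_counts.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}


# Hypothetical example: a reference abstract sentence versus a generated summary.
scores = rouge_1(
    "the virus spreads through respiratory droplets",
    "the virus spreads by droplets",
)
print(scores)
```

Recall measures how much of the reference the generated summary covers, while precision measures how much of the generated summary appears in the reference. Overlap metrics of this kind do not capture factual correctness, which is one reason to pair them with visual inspection as the abstract describes.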
