DL | IR | Apr 30, 2020

Getting Insights from a Large Corpus of Scientific Papers on Specialisted Comprehensive Topics -- the Case of COVID-19

arXiv:2005.00485v1 | 1 citation
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for reliable information extraction from scientific papers to combat fake news, though it is incremental in applying existing NLP techniques to a new dataset.

The paper tackled the challenge of analyzing a large corpus of 24,000 COVID-19 scientific papers by developing two NLP and graph-based methods to extract insights on specific sub-topics like virus origin and drug uses, enabling automatic computer-assisted analysis.

COVID-19 is one of the most important topics these days, particularly on search engines and in the news. While fake news is easily shared, scientific papers are reliable sources from which information can be extracted. With about 24,000 scientific publications on COVID-19 and related research on PubMed, automatic computer-assisted analysis is required. In this paper, we develop two methodologies to gain insights into specific sub-topics of interest and the latest research sub-topics. They rely on natural language processing and graph-based visualizations. We run these methodologies on two cases: the virus origin and the uses of existing drugs.
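To make the idea of combining natural language processing with a graph view more concrete, the following is a minimal sketch of one common approach: a term co-occurrence graph restricted to documents that mention a sub-topic keyword. This is not the paper's method, which the abstract does not specify. The toy abstracts, the stopword list, and the function names `tokenize` and `cooccurrence_graph` are all assumptions made for illustration.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical toy abstracts; NOT taken from the paper's corpus.
ABSTRACTS = [
    "Bats are a likely reservoir for the origin of the coronavirus.",
    "The origin of the virus may involve bats and pangolins.",
    "Chloroquine is an existing drug tested against the coronavirus.",
]

# Assumed, deliberately tiny stopword list for this illustration.
STOPWORDS = {"a", "an", "and", "are", "for", "is", "may", "of", "the", "against"}


def tokenize(text):
    """Lowercase, keep alphabetic words, drop stopwords and duplicates."""
    words = re.findall(r"[a-z]+", text.lower())
    return sorted({w for w in words if w not in STOPWORDS})


def cooccurrence_graph(docs, topic):
    """Count term pairs that co-occur in documents mentioning `topic`.

    Returns a Counter mapping (term_a, term_b) to an edge weight,
    with each pair stored in alphabetical order.
    """
    edges = Counter()
    for doc in docs:
        terms = tokenize(doc)
        if topic not in terms:
            continue  # keep only documents about the sub-topic of interest
        edges.update(combinations(terms, 2))
    return edges


graph = cooccurrence_graph(ABSTRACTS, "origin")
for (left, right), weight in graph.most_common(3):
    print(left, right, weight)
```

The resulting weighted edge list could be passed to any graph-drawing tool for visualization. On a real corpus, a proper tokenizer and a fuller stopword list would likely be needed.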

