Vapur: A Search Engine to Find Related Protein-Compound Pairs in COVID-19 Literature
This work addresses the problem biomedical researchers face in efficiently navigating domain-specific terminology in COVID-19 publications, though it is incremental in that it applies existing NLP methods to a new dataset.
The authors tackled the challenge of finding relevant protein-compound pairs in COVID-19 literature by developing Vapur, a specialized search engine that uses a relation-oriented inverted index, resulting in a publicly available tool for domain researchers.
Coronavirus Disease of 2019 (COVID-19) created dire consequences globally and triggered an intense scientific effort across different domains. The resulting publications form a huge text collection in which finding the studies related to a biomolecule of interest is challenging for general-purpose search engines, because the publications are rich in domain-specific terminology. Here, we present Vapur: an online COVID-19 search engine specifically designed to find related protein-chemical pairs. Vapur is powered by a relation-oriented inverted index that can retrieve and group studies for a query biomolecule with respect to its related entities. The inverted index of Vapur is created automatically with a BioNLP pipeline and integrated with an online user interface. The interface, publicly available at https://tabilab.cmpe.boun.edu.tr/vapur/, is designed for the smooth traversal of the current literature by domain researchers.
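To make the idea of a relation-oriented inverted index more concrete, the following is a minimal sketch in Python, not Vapur's actual implementation. It assumes that some upstream pipeline has already extracted (protein, chemical, document) triples; the class name `RelationIndex`, its methods, the lowercasing used as a stand-in for entity normalization, and the placeholder entity and paper identifiers are all hypothetical and chosen only for illustration.

```python
from collections import defaultdict


class RelationIndex:
    """Toy relation-oriented inverted index (illustrative assumption only).

    Maps each entity to its related entities, and each related entity
    to the set of documents in which the relation was mentioned.
    """

    def __init__(self):
        # entity -> related entity -> set of document ids
        self._index = defaultdict(lambda: defaultdict(set))

    def add_relation(self, protein, chemical, doc_id):
        # Lowercasing is a crude stand-in for real entity normalization.
        p, c = protein.lower(), chemical.lower()
        # Store both directions so either entity type can be queried.
        self._index[p][c].add(doc_id)
        self._index[c][p].add(doc_id)

    def search(self, query):
        """Return documents grouped by the entities related to the query."""
        related = self._index.get(query.lower(), {})
        # Order groups by number of supporting documents, then by name.
        ordered = sorted(related.items(), key=lambda kv: (-len(kv[1]), kv[0]))
        return {entity: sorted(docs) for entity, docs in ordered}


# Placeholder triples, invented for illustration (not real findings).
index = RelationIndex()
index.add_relation("protein_A", "compound_X", "paper-1")
index.add_relation("protein_A", "compound_X", "paper-2")
index.add_relation("protein_A", "compound_Y", "paper-3")
index.add_relation("protein_B", "compound_X", "paper-2")

results = index.search("Protein_A")
for entity, docs in results.items():
    print(entity, docs)
```

In this sketch a query for a protein returns its related chemicals, each with its own list of supporting documents, and a query for a chemical works the same way in the opposite direction. Grouping results by related entity, rather than returning one flat list of documents, is the property that the abstract attributes to the relation-oriented index.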