Nathan S. Upham

20.5 | CL | May 11, 2023

A Novel Dataset Towards Extracting Virus-Host Interactions

Rasha Alshawi, Atriya Sen, Nathan S. Upham et al.

We describe a novel dataset for the automated recognition of named taxonomic and other entities relevant to the association of viruses with their hosts. We further describe some initial results from applying pre-trained models to the named-entity recognition (NER) task on this dataset. We propose that our dataset of manually annotated abstracts offers a Gold Standard Corpus for training future NER models to automatically extract host-pathogen detection methods from scientific publications. We also explain how our work takes first steps towards automatically predicting viral spillover risk, an important concept for human health, from the scientific literature.
