Building a Corpus for Biomedical Relation Extraction of Species Mentions
This work addresses the need for structured data on species interactions in biomedical research. Its contribution is incremental, since it builds on existing annotation tools and models.
The authors tackle the extraction of binary relations between species in biomedical texts by creating a manually annotated corpus, Species-Species Interaction, focused on the gut microbiota, and report promising initial results using BERT and its biomedical variants.
We present Species-Species Interaction, a manually annotated corpus for extracting meaningful binary relations between species at the sentence level in biomedical texts, with a focus on the gut microbiota. After evaluating different Named Entity Recognition (NER) species taggers, we use PubTator to annotate species in full-text articles. Our first results for extracting relations between species with BERT and its biomedical variants are promising.
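To make the sentence-level, binary setup concrete, the following minimal sketch shows one common way to turn a sentence with tagged species mentions into candidate pairs for a BERT-style classifier. It is an illustration under stated assumptions, not the authors' pipeline: the marker tokens `[E1]`/`[E2]`, the helper names `mark_pair` and `candidate_pairs`, the character-offset format, and the example sentence are all hypothetical.

```python
from itertools import combinations
from typing import List, Tuple

# A species mention as (start, end) character offsets within one sentence.
# This offset format is an assumption made for the sketch.
Mention = Tuple[int, int]


def mark_pair(sentence: str, first: Mention, second: Mention) -> str:
    """Wrap two non-overlapping mentions in hypothetical marker tokens."""
    # Sort so that [E1] always marks the mention that occurs first.
    (s1, e1), (s2, e2) = sorted([first, second])
    return (
        sentence[:s1] + "[E1] " + sentence[s1:e1] + " [/E1]"
        + sentence[e1:s2] + "[E2] " + sentence[s2:e2] + " [/E2]"
        + sentence[e2:]
    )


def candidate_pairs(sentence: str, mentions: List[Mention]) -> List[str]:
    """Return one marked copy of the sentence per unordered mention pair."""
    return [mark_pair(sentence, a, b) for a, b in combinations(mentions, 2)]


# Invented example sentence; offsets would normally come from an NER tagger.
sentence = "Bacteroides fragilis and Escherichia coli colonize the gut of Homo sapiens."
mentions = [(0, 20), (25, 41), (62, 74)]

pairs = candidate_pairs(sentence, mentions)
for marked in pairs:
    print(marked)
```

Each marked sentence is one classification instance: a sentence with three species mentions yields three candidate pairs, and a classifier would then decide for each pair whether a meaningful relation holds.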