CL AI IR
Oct 31, 2021

R-BERT-CNN: Drug-target interactions extraction from biomedical literature

Jehad Aldahdooh, Ziaurrehman Tanoli, Jing Tang

arXiv:2111.00611v1
0.76 citations

Originality: Synthesis-oriented

AI Analysis

This work addresses the difficulty biomedical researchers face in manually extracting interactions from more than 32 million articles. The contribution is incremental: it builds on existing methods and applies them to a specific task.

The paper tackles the problem of automatically extracting drug-target interactions from biomedical literature to aid drug discovery, achieving a micro F1 score of 55.67% on the BioCreative VII DrugProt test corpus and 63% on the BioCreative VI ChemProt test corpus.

In this paper, we present our participation in the DrugProt task of the BioCreative VII challenge. Drug-target interactions (DTIs) are critical for drug discovery and repurposing, and they are often extracted manually from experimental articles. PubMed holds more than 32 million biomedical articles, and manually extracting DTIs from such a large knowledge base is challenging. To address this, we provide a solution for Track 1, which aims to extract 10 types of interactions between drug and protein entities. We applied an ensemble classifier that combines BioMed-RoBERTa, a state-of-the-art language model, with convolutional neural networks (CNNs) to extract these relations. Despite the class imbalance in the BioCreative VII DrugProt test corpus, our model performs well compared with the average of other submissions in the challenge, with a micro F1 score of 55.67% (and 63% on the BioCreative VI ChemProt test corpus). The results show the potential of deep learning for extracting various types of DTIs.
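The abstract reports results as micro F1, which pools true positives, false positives, and false negatives across all relation types before computing precision and recall, so frequent classes weigh more than rare ones. The following is a minimal sketch of that metric, not the challenge's official scorer. It assumes one predicted label per drug-protein pair and a hypothetical "NONE" label for pairs with no relation, which is excluded from the counts; the function name and the example labels are illustrative only.

```python
def micro_f1(gold, pred, negative_label="NONE"):
    """Micro-averaged F1 over relation labels, ignoring the negative class.

    gold, pred: equal-length sequences of labels, one per candidate pair.
    negative_label: hypothetical marker for "no relation" (an assumption).
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must have the same length")

    tp = fp = fn = 0
    for g, p in zip(gold, pred):
        if p != negative_label:
            if p == g:
                tp += 1  # predicted relation matches the gold relation
            else:
                fp += 1  # predicted a relation that is not the gold one
        if g != negative_label and p != g:
            fn += 1      # gold relation was missed or mislabeled

    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    # Illustrative labels only; not data from the paper.
    gold = ["INHIBITOR", "NONE", "ACTIVATOR", "INHIBITOR"]
    pred = ["INHIBITOR", "INHIBITOR", "NONE", "ACTIVATOR"]
    print(f"micro F1 = {micro_f1(gold, pred):.4f}")
```

Because the counts are pooled, a mislabeled pair (the last one above) costs both a false positive and a false negative, which is one reason class imbalance can depress the score.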
