LGDec 15, 2020

Bayesian neural network with pretrained protein embedding enhances prediction accuracy of drug-protein interaction

arXiv:2012.08194v238 citations
AI Analysis

This work provides an incremental improvement in drug-protein interaction prediction, which is important for drug discovery researchers facing small labeled datasets.

This paper addresses the challenge of predicting drug-protein interactions with limited labeled data. They propose a deep learning framework that leverages transfer learning with pretrained protein embeddings and a Bayesian neural network to improve prediction accuracy, outperforming previous baselines.

The characterization of drug-protein interactions is crucial in the high-throughput screening for drug discovery. The deep learning-based approaches have attracted attention because they can predict drug-protein interactions without trial-and-error by humans. However, because data labeling requires significant resources, the available protein data size is relatively small, which consequently decreases model performance. Here we propose two methods to construct a deep learning framework that exhibits superior performance with a small labeled dataset. At first, we use transfer learning in encoding protein sequences with a pretrained model, which trains general sequence representations in an unsupervised manner. Second, we use a Bayesian neural network to make a robust model by estimating the data uncertainty. As a result, our model performs better than the previous baselines for predicting drug-protein interactions. We also show that the quantified uncertainty from the Bayesian inference is related to the confidence and can be used for screening DPI data points.

Code Implementations1 repo
Foundations

The foundational work for this paper's niche, ranked by how specifically the neighbourhood builds on it — not by global fame.

Your Notes