Multi-Peptide: Multimodality Leveraged Language-Graph Learning of Peptide Properties
This work addresses peptide property prediction for bioinformatics researchers, offering an incremental improvement through multimodal integration.
The study addresses peptide property prediction by introducing Multi-Peptide, a multimodal approach that combines transformer-based language models with Graph Neural Networks and achieves state-of-the-art 86.185% accuracy in hemolysis prediction.
Peptides are essential in biological processes and therapeutics. In this study, we introduce Multi-Peptide, an approach that combines transformer-based language models with Graph Neural Networks (GNNs) to predict peptide properties. Multi-Peptide pairs PeptideBERT, a transformer model tailored for peptide property prediction, with a GNN encoder to capture both sequence-based and structural features. Using Contrastive Language-Image Pre-training (CLIP), Multi-Peptide aligns embeddings from the two modalities in a shared latent space, thereby enhancing the model's predictive accuracy. Evaluations on hemolysis and nonfouling datasets demonstrate Multi-Peptide's robustness, with a state-of-the-art accuracy of 86.185% in hemolysis prediction. This study highlights the potential of multimodal learning in bioinformatics, paving the way for accurate and reliable predictions in peptide-based research and applications.
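To make the alignment step concrete, the following is a minimal sketch of a CLIP-style symmetric contrastive loss over paired sequence and graph embeddings. It is an illustration under stated assumptions, not the authors' implementation: the function names, the temperature value of 0.07, and the toy two-dimensional embeddings are all hypothetical, and in practice the embeddings would come from PeptideBERT and the GNN encoder.

```python
import math


def l2_normalize(v):
    # Scale a vector to unit length (assumes the vector is non-zero).
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def log_softmax(row):
    # Numerically stable log-softmax over a list of logits.
    m = max(row)
    lse = m + math.log(sum(math.exp(x - m) for x in row))
    return [x - lse for x in row]


def clip_style_loss(seq_emb, graph_emb, temperature=0.07):
    """Symmetric contrastive loss for a batch of paired embeddings.

    seq_emb[i] and graph_emb[i] are assumed to describe the same peptide.
    The temperature default is an illustrative assumption.
    """
    s = [l2_normalize(v) for v in seq_emb]
    g = [l2_normalize(v) for v in graph_emb]
    n = len(s)
    # Pairwise cosine similarities, scaled by the temperature.
    logits = [
        [sum(a * b for a, b in zip(s[i], g[j])) / temperature for j in range(n)]
        for i in range(n)
    ]
    # Sequence-to-graph direction: row i should select column i.
    loss_s2g = -sum(log_softmax(logits[i])[i] for i in range(n)) / n
    # Graph-to-sequence direction: column j should select row j.
    columns = [[logits[i][j] for i in range(n)] for j in range(n)]
    loss_g2s = -sum(log_softmax(columns[j])[j] for j in range(n)) / n
    return 0.5 * (loss_s2g + loss_g2s)


# Toy batch: two peptides with hypothetical 2-D embeddings.
aligned = [[1.0, 0.0], [0.0, 1.0]]
shuffled = [[0.0, 1.0], [1.0, 0.0]]
loss_aligned = clip_style_loss(aligned, aligned)
loss_mismatched = clip_style_loss(aligned, shuffled)
```

In this sketch, matched pairs yield a loss close to zero, while swapped pairs are heavily penalized, which is the pressure that pulls the two modalities into a shared latent space.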