QM · LG · GN · BM · Jul 12, 2024

Topology-enhanced machine learning model (Top-ML) for anticancer peptide prediction

arXiv:2407.08974v4 · 2 citations · h-index: 10
Originality: Incremental advance
AI Analysis

This work addresses the problem of identifying therapeutic peptides for cancer treatment. It offers an interpretable method whose contribution is an incremental improvement to peptide featurization for existing machine learning models.

The paper tackles the bottleneck of inefficient peptide featurization in AI-based anticancer peptide prediction by proposing Top-ML, which uses topological features derived from sequence connection information. It achieves state-of-the-art or comparable performance on the AntiCP 2.0 and mACPpred 2.0 benchmark datasets.

Recently, therapeutic peptides have demonstrated great promise for cancer treatment. To explore powerful anticancer peptides, artificial intelligence (AI)-based approaches have been developed to systematically screen potential candidates. However, the lack of efficient featurization of peptides has become a bottleneck for these machine-learning models. In this paper, we propose a topology-enhanced machine learning model (Top-ML) for anticancer peptide prediction. Top-ML employs peptide topological features derived from the sequence "connection" information of each peptide, characterized by vector and spectral descriptors. The Top-ML model, which uses an Extra-Trees classifier, has been validated on the AntiCP 2.0 and mACPpred 2.0 benchmark datasets, achieving state-of-the-art performance or results comparable to existing deep learning models, while providing greater interpretability. Our results highlight the potential of leveraging novel topology-based featurization to accelerate the identification of anticancer peptides.
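To make the idea of combining vector and spectral descriptors of sequence "connection" information concrete, here is a minimal sketch in Python. It is not the Top-ML featurization. It assumes a simple stand-in, where the "connection" information is a graph over the 20 standard amino acid types, with an edge weighted by how often two residue types are adjacent in the sequence. The names `adjacency_matrix` and `featurize`, the choice of graph, and the five spectral summary statistics are all hypothetical and chosen for illustration only.

```python
import numpy as np

# The 20 standard amino acids; sequences are assumed to be uppercase
# and to contain only these letters (anything else raises KeyError).
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def adjacency_matrix(sequence):
    """Symmetric 20x20 matrix counting adjacent residue-type pairs.

    Assumption: this graph is an illustrative stand-in for the
    "connection" information described in the abstract.
    """
    adj = np.zeros((20, 20))
    for a, b in zip(sequence, sequence[1:]):
        i, j = INDEX[a], INDEX[b]
        if i != j:  # ignore self-loops such as "AA"
            adj[i, j] += 1
            adj[j, i] += 1
    return adj


def featurize(sequence):
    """Return a 25-dimensional feature vector for one peptide.

    First 20 entries: weighted node degrees (a vector descriptor).
    Last 5 entries: summary statistics of the graph Laplacian
    spectrum (a spectral descriptor).
    """
    adj = adjacency_matrix(sequence)
    degrees = adj.sum(axis=1)
    laplacian = np.diag(degrees) - adj
    eig = np.linalg.eigvalsh(laplacian)  # real eigenvalues, ascending
    nonzero = eig[eig > 1e-9]
    spectral = np.array([
        eig.max(),
        eig.mean(),
        eig.std(),
        nonzero.min() if nonzero.size else 0.0,
        float(np.sum(eig <= 1e-9)),  # zero eigenvalues = connected components
    ])
    return np.concatenate([degrees, spectral])


features = featurize("ACAC")
```

Stacking `featurize` outputs for many peptides gives a feature matrix that could be passed to a tree ensemble such as scikit-learn's `ExtraTreesClassifier`, the classifier family named in the abstract. Because each feature maps to a named degree or spectral statistic, tree-based feature importances remain readable, which is one plausible route to the interpretability the abstract mentions.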

Code Implementations: 1 repo
