Near Perfect Protein Multi-Label Classification with Deep Neural Networks
This work addresses protein function prediction, a core bioinformatics problem, but the contribution is incremental: it applies existing ANN methods, with new architectures, to a specific domain.
The authors tackle protein multi-label classification by developing two new artificial neural networks, reporting near-perfect AUC scores of 99.99% for 698 UniProt families and 99.45% for 983 Gene Ontology classes.
Artificial neural networks (ANNs) have gained well-deserved popularity among machine learning tools following their recent successful applications in image and sound processing and in classification problems. ANNs have also been applied to predicting the family or function of a protein from its residue sequence. Here we present two new ANNs with multi-label classification ability, which show impressive accuracy when classifying protein sequences into 698 UniProt families (AUC=99.99%) and 983 Gene Ontology classes (AUC=99.45%).
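
To make the reported metric concrete, the following is a minimal sketch of how an AUC can be computed for a multi-label classifier. It is not the authors' evaluation code. It assumes the per-class AUCs are averaged with equal weight (macro averaging), which the abstract does not state, and the function names, the toy labels, and the toy scores are all hypothetical.

```python
from typing import List, Sequence


def binary_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a random positive outscores a random negative.

    Ties count as half a win. Raises ValueError if either class is absent.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def macro_auc(y_true: List[List[int]], y_score: List[List[float]]) -> float:
    """Unweighted mean of per-class AUCs (assumed averaging scheme)."""
    n_classes = len(y_true[0])
    per_class = []
    for c in range(n_classes):
        labels = [row[c] for row in y_true]
        scores = [row[c] for row in y_score]
        per_class.append(binary_auc(labels, scores))
    return sum(per_class) / n_classes


# Toy data: 4 sequences, 3 hypothetical classes.
# In multi-label classification a sequence may belong to several classes at once.
y_true = [
    [1, 0, 1],
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
]
# Hypothetical per-class scores, e.g. independent sigmoid outputs of a network.
y_score = [
    [0.9, 0.2, 0.8],
    [0.1, 0.7, 0.3],
    [0.8, 0.6, 0.4],
    [0.3, 0.1, 0.2],
]

if __name__ == "__main__":
    print(f"macro AUC = {macro_auc(y_true, y_score):.4f}")
```

Each class is scored as its own binary problem, so a model can rank one class perfectly and another poorly. The pairwise comparison is quadratic in the number of examples, so for datasets of realistic size a rank-based implementation would be preferable.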