Binary classification of proteins by a Machine Learning approach
This work addresses protein classification in bioinformatics, but its contribution is incremental: it applies an existing deep learning method to a specific dataset.
The authors develop a deep learning system, built on a Convolutional Neural Network, that classifies protein chains from the chemical, physical, and geometric properties recorded in the Protein Data Bank. The result is a prototype pipeline validated on the classification of amino acid sequences.
In this work we present a Deep Learning system, based on a Convolutional Neural Network, that classifies protein chains of amino acids from the protein descriptions contained in the Protein Data Bank. Each protein is fully described, in terms of its chemical, physical, and geometric properties, in a file in XML format. The aim of the work is to design a prototype Deep Learning pipeline for collecting and managing a large amount of data, and to validate it by applying it to the classification of sequences of amino acids. We envisage applying the described approach to more general classification problems in biomolecules, related to structural properties and similarities.
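As a rough illustration of the kind of computation involved, the following sketch shows a forward pass of a very small one-dimensional convolutional classifier over an amino acid sequence. It is not the architecture used in this work: the one-hot encoding, the fixed input length, the filter sizes, and all function names (`one_hot`, `conv1d`, `predict`, `random_params`) are assumptions made for the example, and the weights are random rather than trained. The system described above draws on chemical, physical, and geometric properties, which this sketch does not model.

```python
import math
import random

# The 20 standard amino acids in one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"
INDEX = {aa: i for i, aa in enumerate(AMINO_ACIDS)}


def one_hot(sequence, length):
    """Encode a chain as a `length` x 20 matrix, truncated or zero-padded."""
    rows = []
    for aa in sequence[:length]:
        row = [0.0] * len(AMINO_ACIDS)
        if aa in INDEX:  # unknown residue codes are left as all-zero rows
            row[INDEX[aa]] = 1.0
        rows.append(row)
    while len(rows) < length:
        rows.append([0.0] * len(AMINO_ACIDS))
    return rows


def conv1d(x, kernels, bias):
    """Valid 1D convolution along the sequence, followed by ReLU.

    x: L x C, kernels: F x K x C, bias: F. Returns (L - K + 1) x F.
    """
    k = len(kernels[0])
    out = []
    for start in range(len(x) - k + 1):
        window = x[start:start + k]
        feats = []
        for f, kern in enumerate(kernels):
            s = bias[f]
            for j in range(k):
                for c in range(len(window[j])):
                    s += kern[j][c] * window[j][c]
            feats.append(max(0.0, s))
        out.append(feats)
    return out


def global_max_pool(feature_map):
    """Keep the strongest response of each filter over all positions."""
    return [max(col) for col in zip(*feature_map)]


def random_params(n_filters=4, kernel_size=5, seed=0):
    """Hypothetical untrained weights, for demonstration only."""
    rng = random.Random(seed)
    channels = len(AMINO_ACIDS)
    return {
        "kernels": [[[rng.uniform(-0.1, 0.1) for _ in range(channels)]
                     for _ in range(kernel_size)]
                    for _ in range(n_filters)],
        "conv_bias": [0.0] * n_filters,
        "out_weights": [rng.uniform(-0.1, 0.1) for _ in range(n_filters)],
        "out_bias": 0.0,
    }


def predict(sequence, params, length=64):
    """Return a score in (0, 1) for the positive class."""
    x = one_hot(sequence, length)
    pooled = global_max_pool(conv1d(x, params["kernels"], params["conv_bias"]))
    logit = params["out_bias"] + sum(
        w * p for w, p in zip(params["out_weights"], pooled))
    return 1.0 / (1.0 + math.exp(-logit))


if __name__ == "__main__":
    # Arbitrary example sequence; the score is meaningless without training.
    print(predict("MKTAYIAKQRQISFVKSHFSRQ", random_params()))
```

Global max pooling is used here so that the score does not depend on where in the chain a motif occurs; a trained model would learn the kernel and output weights from labelled chains.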