End-to-End Optimized Pipeline for Prediction of Protein Folding Kinetics
This work addresses the need for early detection of protein folding discrepancies to prevent degenerative diseases, offering a domain-specific incremental improvement.
The research tackled the problem of predicting protein folding kinetics by proposing an efficient pipeline that achieved 4.8% higher accuracy, used 327x less memory, and ran 7.3% faster than state-of-the-art ML models.
Protein folding is the intricate process by which a linear sequence of amino acids self-assembles into a unique three-dimensional structure. Protein folding kinetics is the study of the pathways and time-dependent mechanisms a protein undergoes as it folds. Understanding protein folding kinetics is essential because a protein must fold correctly to perform its biological functions optimally. A misfolded protein can be contorted into shapes that are not suited to the cellular environment, giving rise to many degenerative and neurodegenerative disorders and amyloid diseases. Monitoring at-risk individuals and detecting discrepancies in a protein's folding kinetics at an early stage could yield major public health benefits, as preventive measures can then be taken. This research proposes an efficient pipeline for predicting protein folding kinetics with high accuracy and a low memory footprint. The deployed machine learning (ML) model outperformed state-of-the-art ML models by 4.8% in accuracy while consuming 327x less memory and running 7.3% faster.