Predicting clinical significance of BRCA1 and BRCA2 single nucleotide substitution variants with unknown clinical significance using probabilistic neural network and deep neural network-stacked autoencoder
This work addresses the need for faster and more accurate computational methods for predicting breast cancer risk from genetic variants, and improves incrementally on previous attempts.
The paper tackles the problem of predicting the clinical significance of BRCA1 and BRCA2 single nucleotide substitution variants. A deep neural network-stacked autoencoder achieves accuracies of up to 95.41% for BRCA1 and 92.80% for BRCA2 after about 7 hours of training, while a probabilistic neural network reaches 87.97% and 82.17% with a total training and testing time of 0.9 seconds.
Non-synonymous single nucleotide polymorphisms (nsSNPs) are single nucleotide substitutions that occur in the coding region of a gene and lead to a change in the amino-acid sequence of the protein. Studies have shown that these variations may be associated with disease. Thus, investigating the effects of nsSNPs on protein function will give greater insight into how nsSNPs can lead to disease. Breast cancer is the most common cancer among women and causes the highest number of cancer deaths every year. The BRCA1 and BRCA2 tumor suppressor genes are the two main candidates in which mutations can increase the risk of developing breast cancer. For prediction and detection of the cancer one can use experimental or computational methods, but experimental methods are very costly and time consuming in comparison with computational ones. Computational methods have been used for more than 30 years. Here we try to predict the clinical significance of BRCA1 and BRCA2 nsSNPs, including those whose clinical significance is unknown. Nearly 500 BRCA1 and BRCA2 nsSNPs with known clinical significance were retrieved from the NCBI database. Based on hydrophobicity or hydrophilicity and their role in the secondary structure of proteins, the amino acids are divided into 6 groups, each assigned a score. The data are prepared in a form acceptable to the automated prediction mechanisms, a Probabilistic Neural Network (PNN) and a Deep Neural Network-Stacked AutoEncoder (DNN). With Jackknife cross validation we show that the prediction accuracies achieved for BRCA1 and BRCA2 using the PNN are 87.97% and 82.17% respectively, while accuracies of 95.41% and 92.80% are achieved using the DNN. The total processing time required for training and testing the PNN is 0.9 seconds, whereas the DNN requires about 7 hours of training but can then predict instantly. Both methods show great improvement in accuracy and speed compared to previous attempts.
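The evaluation scheme described above, a PNN scored with Jackknife (leave-one-out) cross validation, can be illustrated with a minimal sketch. It is not the authors' implementation: the Gaussian kernel width `sigma`, the two-dimensional feature vectors, and the function names `pnn_predict` and `jackknife_accuracy` are all assumptions made for illustration, and the toy data below are invented rather than taken from the NCBI set.

```python
import math


def pnn_predict(train_x, train_y, query, sigma=0.5):
    """Classify `query` with a Gaussian-kernel PNN (Parzen-window) rule.

    `sigma` is a hypothetical smoothing parameter; the abstract does not
    state the value used in the paper.
    """
    sums, counts = {}, {}
    for x, y in zip(train_x, train_y):
        sq_dist = sum((a - b) ** 2 for a, b in zip(x, query))
        kernel = math.exp(-sq_dist / (2.0 * sigma ** 2))
        sums[y] = sums.get(y, 0.0) + kernel
        counts[y] = counts.get(y, 0) + 1
    # Average kernel activation per class, then pick the largest.
    return max(sums, key=lambda label: sums[label] / counts[label])


def jackknife_accuracy(features, labels, sigma=0.5):
    """Leave-one-out: hold out each sample once and train on the rest."""
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if pnn_predict(train_x, train_y, features[i], sigma) == labels[i]:
            correct += 1
    return correct / len(features)


# Toy, made-up feature vectors (NOT data from the paper).
features = [[0.1, 0.2], [0.2, 0.1], [0.15, 0.15],
            [0.9, 0.8], [0.8, 0.9], [0.85, 0.85]]
labels = ["benign", "benign", "benign",
          "pathogenic", "pathogenic", "pathogenic"]

accuracy = jackknife_accuracy(features, labels)
print(f"Jackknife accuracy: {accuracy:.2%}")
```

Because a PNN stores the training samples rather than fitting weights iteratively, each leave-one-out round costs only one pass over the remaining samples, which is consistent with the sub-second PNN timing reported above.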