Deep Neural Network Based Precursor microRNA Prediction on Eleven Species
This work addresses the computational identification of microRNAs, a task that matters to biologists studying gene regulation. The contribution is incremental: it applies a deep learning approach to an existing prediction task.
The authors tackled the problem of predicting precursor microRNA sequences across multiple species by developing DP-miRNA, a deep learning model that outperformed several existing classifiers, including support vector machines and random forests, on eleven datasets.
MicroRNAs (miRNAs) are small non-coding RNAs that regulate gene expression at the post-transcriptional level. Determining experimentally whether a sequence segment is a miRNA is challenging, and the results are sensitive to the experimental environment. These limitations motivate the development of computational methods for predicting miRNAs. We propose a deep learning based classification model, called DP-miRNA, for predicting the precursor miRNA sequence that contains the miRNA sequence. DP-miRNA is a feature-based method built on Restricted Boltzmann Machines. It uses 58 features categorized into four groups: sequence features, folding measures, stem-loop features, and statistical features. We evaluate the performance of DP-miRNA on eleven data sets from different species, including human. The deep neural network based classifier outperformed support vector machines, neural networks, naive Bayes classifiers, k-nearest neighbors, random forests, and a hybrid system combining a support vector machine with a genetic algorithm.
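
To give a sense of what the "sequence features" group might contain, the following is a minimal sketch that computes GC content and the 16 dinucleotide frequencies of a candidate precursor sequence. The 58 features used by DP-miRNA are not enumerated here, so the function name `sequence_features`, the feature names, and the choice of features are illustrative assumptions rather than the authors' feature set.

```python
from itertools import product
from typing import Dict

NUCLEOTIDES = "ACGU"


def sequence_features(seq: str) -> Dict[str, float]:
    """Return GC content and dinucleotide frequencies for an RNA sequence.

    Hypothetical helper for illustration only; not the DP-miRNA feature set.
    """
    # Normalize case and accept DNA-style input by mapping T to U.
    seq = seq.upper().replace("T", "U")
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two nucleotides")

    # Count every overlapping pair of adjacent nucleotides.
    n_pairs = len(seq) - 1
    counts = {a + b: 0 for a, b in product(NUCLEOTIDES, repeat=2)}
    for i in range(n_pairs):
        pair = seq[i:i + 2]
        if pair in counts:  # pairs with ambiguous bases (e.g. N) are skipped
            counts[pair] += 1

    # Convert counts to frequencies and add the GC fraction.
    features = {f"freq_{pair}": c / n_pairs for pair, c in counts.items()}
    features["gc_content"] = (seq.count("G") + seq.count("C")) / len(seq)
    return features
```

A vector like this, concatenated with folding, stem-loop, and statistical measures, would form the input to a classifier. Those other groups typically require an RNA secondary-structure predictor and are left out of this sketch.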