HMACA: Towards Proposing a Cellular Automata Based Tool for Protein Coding, Promoter Region Identification and Protein Structure Prediction
This work addresses a challenging issue in bioinformatics for researchers and practitioners, but it appears incremental as it builds on existing cellular automata methods with modest accuracy improvements.
The paper tackles the problem of identifying protein coding and promoter regions in DNA sequences, as well as predicting protein structures, by developing a cellular automata-based tool called HMACA, which achieves accuracies of 76% for region identification and 80% for structure prediction.
Human body consists of lot of cells, each cell consist of DeOxaRibo Nucleic Acid (DNA). Identifying the genes from the DNA sequences is a very difficult task. But identifying the coding regions is more complex task compared to the former. Identifying the protein which occupy little place in genes is a really challenging issue. For understating the genes coding region analysis plays an important role. Proteins are molecules with macro structure that are responsible for a wide range of vital biochemical functions, which includes acting as oxygen, cell signaling, antibody production, nutrient transport and building up muscle fibers. Promoter region identification and protein structure prediction has gained a remarkable attention in recent years. Even though there are some identification techniques addressing this problem, the approximate accuracy in identifying the promoter region is closely 68% to 72%. We have developed a Cellular Automata based tool build with hybrid multiple attractor cellular automata (HMACA) classifier for protein coding region, promoter region identification and protein structure prediction which predicts the protein and promoter regions with an accuracy of 76%. This tool also predicts the structure of protein with an accuracy of 80%.