QMNEAP May 22, 2019

Selection of a Minimal Number of Significant Porcine SNPs by an Information Gain and Genetic Algorithm Hybrid Model

arXiv:1905.09059v1
7 citations
Originality: Synthesis-oriented
AI Analysis

This work addresses the need for efficient genetic profiling in pig breeding, though it is incremental as it combines existing methods.

The paper tackled the problem of selecting a minimal subset of SNPs for porcine breed classification, achieving a reduction to 0.86% of total SNPs with 94.80% classification accuracy.

A panel of a large number of common Single Nucleotide Polymorphisms (SNPs) distributed across the entire porcine genome has been widely used to represent the genetic variability of pigs. With the advent of SNP-array technology, a genome-wide genetic profile of a specimen can be easily observed. Among the large number of such variations, there exists a much smaller subset of the SNP panel that could equally be used to correctly identify the corresponding breed. This work presents an SNP selection heuristic whose selected subset can still be used effectively in the breed classification process. Feature selection combines a filter method and a wrapper method (information gain and a genetic algorithm, respectively) with a feature frequency selection step, while classification is done by a support vector machine. The approach was able to reduce the number of significant SNPs to 0.86% of the total number of SNPs in a swine dataset and provided a high classification accuracy of 94.80%.
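To make the filter stage concrete, here is a minimal sketch of ranking SNPs by information gain against breed labels. It is not the authors' implementation: the function names (`entropy`, `information_gain`, `rank_snps`), the 0/1/2 genotype coding, and the toy data are all assumptions made for illustration. The genetic algorithm wrapper, the feature frequency step, and the support vector machine classifier described in the abstract are not shown.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(genotypes, breeds):
    """Reduction in breed entropy obtained by splitting samples on one SNP."""
    n = len(breeds)
    groups = {}
    for genotype, breed in zip(genotypes, breeds):
        groups.setdefault(genotype, []).append(breed)
    # Weighted average entropy of the breed labels within each genotype group.
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(breeds) - conditional


def rank_snps(snp_matrix, breeds, top_k):
    """Return (indices of the top_k SNP columns by gain, gain of every SNP)."""
    n_snps = len(snp_matrix[0])
    gains = [
        information_gain([row[j] for row in snp_matrix], breeds)
        for j in range(n_snps)
    ]
    order = sorted(range(n_snps), key=lambda j: gains[j], reverse=True)
    return order[:top_k], gains


# Toy data (made up): 6 animals x 3 SNPs, genotypes coded 0/1/2 (assumed coding).
toy_snps = [
    [0, 1, 2],
    [0, 2, 2],
    [0, 1, 0],
    [2, 1, 2],
    [2, 2, 0],
    [2, 1, 0],
]
toy_breeds = ["A", "A", "A", "B", "B", "B"]

selected, gains = rank_snps(toy_snps, toy_breeds, top_k=1)
print(selected, gains)
```

In this toy example the first SNP separates the two breeds perfectly, so it receives the highest gain and is the one retained. In a hybrid scheme of the kind the abstract describes, a ranking like this would only narrow the candidate pool before the wrapper search evaluates subsets with a classifier.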
