QMLGMN
Jul 18, 2012

Protein Function Prediction Based on Kernel Logistic Regression with 2-order Graphic Neighbor Information

arXiv:1207.4463v1
1 citation
Originality: Incremental advance
AI Analysis

This work addresses protein function prediction for bioinformatics, but it is incremental as it extends existing 1-order neighbor methods.

The paper tackles protein function prediction by proposing a 2-order graphic neighbor information feature extraction method. Using RBF kernel logistic regression with a top-5 chi-square feature combination, it reports a 99.05% average overall percentage.

To enhance the accuracy of protein function prediction from protein-protein interaction data, this paper proposes a 2-order graphic neighbor information feature extraction method based on an undirected simple graph, which extends the 1-order graphic neighbor feature extraction method. The chi-square test is also used for feature combination. To demonstrate the effectiveness of the 2-order graphic neighbor feature, four logistic regression models are investigated on the two feature sets: logistic regression (LR), diffusion kernel logistic regression (DKLR), polynomial kernel logistic regression (PKLR), and radial basis function kernel logistic regression (RBF KLR). Experimental results for protein function prediction on the Yeast Proteome Database (YPD), using the protein-protein interaction data of the Munich Information Center for Protein Sequences (MIPS), show that 2-order graphic neighbor information of proteins can significantly improve the average overall percentage of protein function prediction, especially with RBF KLR. With a new top-5 chi-square feature combination method, RBF KLR achieves a 99.05% average overall percentage on the 2-order neighbor feature combination set.
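The idea of 1-order and 2-order neighbor features on an undirected simple graph can be illustrated with a minimal sketch. The abstract does not give the paper's exact feature definition, so everything here is an assumption made for illustration: the function names (`build_adjacency`, `neighbor_sets`, `neighbor_features`), the choice to count annotated neighbors per function, and the choice to exclude direct neighbors and the protein itself from the 2-order set.

```python
from collections import defaultdict


def build_adjacency(edges):
    """Build an undirected simple graph: self-loops and duplicate edges are dropped."""
    adj = defaultdict(set)
    for u, v in edges:
        if u == v:
            continue  # a simple graph has no self-loops
        adj[u].add(v)
        adj[v].add(u)  # undirected: store both directions
    return adj


def neighbor_sets(adj, protein):
    """Return (1-order neighbors, 2-order neighbors) of a protein.

    Assumption: 2-order neighbors are nodes at distance exactly two,
    so direct neighbors and the protein itself are excluded.
    """
    first = set(adj.get(protein, ()))
    second = set()
    for n in first:
        second |= adj.get(n, set())
    second -= first
    second.discard(protein)
    return first, second


def neighbor_features(adj, annotations, protein, functions):
    """For each function, count annotated 1-order and 2-order neighbors.

    Assumption: raw counts are used as features; the paper may define
    its features differently.
    """
    first, second = neighbor_sets(adj, protein)
    features = []
    for f in functions:
        c1 = sum(1 for n in first if f in annotations.get(n, ()))
        c2 = sum(1 for n in second if f in annotations.get(n, ()))
        features.append((c1, c2))
    return features


# Hypothetical toy interaction data, not taken from MIPS or YPD.
edges = [("A", "B"), ("B", "C"), ("C", "D"), ("A", "A"), ("B", "A")]
annotations = {"B": {"f1"}, "C": {"f1", "f2"}, "D": {"f2"}}
adj = build_adjacency(edges)
features_for_a = neighbor_features(adj, annotations, "A", ["f1", "f2"])
```

In this toy graph, protein `A` has one direct neighbor (`B`) and one 2-order neighbor (`C`), so the 2-order counts add information about function `f2` that the 1-order counts alone do not carry. A feature matrix built this way could then be passed to a chi-square selection step and a kernel classifier, which this sketch does not cover.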

