GN LG QM
May 15, 2017

DeepGO: Predicting protein functions from sequence and interactions using a deep ontology-aware classifier

arXiv:1705.05919v1
460 citations
Originality: Highly original
AI Analysis

The paper addresses the need for efficient computational prediction of protein function, since experimental characterization is costly and slow.

It tackles protein function prediction from sequence and interaction data and reports a significant improvement over baseline methods such as BLAST, particularly for predicting cellular locations, when evaluated by CAFA standards.
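
CAFA-style evaluation is usually summarized with the protein-centric Fmax measure: the best F1 score obtained over all prediction-score thresholds. The sketch below shows how that measure can be computed. It is a minimal illustration under stated assumptions, not the official CAFA evaluation code: the function name `f_max`, the input layout, the 0.01 threshold grid, and the toy proteins and terms are all hypothetical.

```python
def f_max(predictions, annotations, thresholds=None):
    """Protein-centric Fmax (illustrative sketch).

    predictions: protein -> {term: score in [0, 1]}
    annotations: protein -> set of true terms
    """
    if thresholds is None:
        # Assumed threshold grid: 0.01, 0.02, ..., 1.00
        thresholds = [i / 100 for i in range(1, 101)]
    best = 0.0
    for t in thresholds:
        precisions, recalls = [], []
        for protein, true_terms in annotations.items():
            if not true_terms:
                continue  # proteins without annotations cannot be scored
            scores = predictions.get(protein, {})
            predicted = {term for term, s in scores.items() if s >= t}
            tp = len(predicted & true_terms)
            # Recall is averaged over every annotated protein.
            recalls.append(tp / len(true_terms))
            # Precision is averaged only over proteins with a prediction at t.
            if predicted:
                precisions.append(tp / len(predicted))
        if not precisions or not recalls:
            continue
        p = sum(precisions) / len(precisions)
        r = sum(recalls) / len(recalls)
        if p + r > 0:
            best = max(best, 2 * p * r / (p + r))
    return best


# Toy data (made up for illustration only).
toy_annotations = {"P1": {"a", "b"}, "P2": {"c"}}
toy_predictions = {
    "P1": {"a": 0.9, "b": 0.4, "c": 0.2},
    "P2": {"c": 0.8, "a": 0.5},
}
toy_score = f_max(toy_predictions, toy_annotations)
```

Averaging precision only over proteins that receive at least one prediction at a given threshold, while averaging recall over all annotated proteins, penalizes a method for leaving proteins unpredicted without letting empty predictions distort precision.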

A large number of protein sequences are becoming available through the application of novel high-throughput sequencing technologies. Experimental functional characterization of these proteins is time-consuming and expensive, and is often only done rigorously for few selected model organisms. Computational function prediction approaches have been suggested to fill this gap. The functions of proteins are classified using the Gene Ontology (GO), which contains over 40,000 classes. Additionally, proteins have multiple functions, making function prediction a large-scale, multi-class, multi-label problem. We have developed a novel method to predict protein function from sequence. We use deep learning to learn features from protein sequences as well as a cross-species protein-protein interaction network. Our approach specifically outputs information in the structure of the GO and utilizes the dependencies between GO classes as background information to construct a deep learning model. We evaluate our method using the standards established by the Computational Assessment of Function Annotation (CAFA) and demonstrate a significant improvement over baseline methods such as BLAST, with significant improvement for predicting cellular locations.
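
The idea of using dependencies between GO classes can be illustrated with a small sketch. One common way to respect an ontology's structure is to make multi-label scores hierarchically consistent, so that every class scores at least as high as any of its descendants. The example below does this on a toy ontology. The class names, the `PARENTS` mapping, and the scores are hypothetical, and this is an illustration of the general principle rather than the model architecture described in the paper.

```python
# Hypothetical toy ontology: child -> list of parents (a DAG, like is_a links in GO).
PARENTS = {
    "root": [],
    "binding": ["root"],
    "catalytic": ["root"],
    "protein_binding": ["binding"],
    "kinase": ["catalytic", "binding"],
}


def children_of(parents):
    """Invert a child -> parents mapping into parent -> children."""
    children = {term: [] for term in parents}
    for child, term_parents in parents.items():
        for parent in term_parents:
            children[parent].append(child)
    return children


def propagate_scores(raw_scores, parents):
    """Return scores in which each class is >= every one of its descendants."""
    children = children_of(parents)
    memo = {}

    def score(term):
        # A class's consistent score is the max of its own raw score
        # and the consistent scores of all of its children.
        if term not in memo:
            child_scores = [score(c) for c in children[term]]
            memo[term] = max([raw_scores.get(term, 0.0)] + child_scores)
        return memo[term]

    return {term: score(term) for term in parents}


# Made-up raw classifier outputs for a single protein.
raw = {
    "root": 0.2,
    "binding": 0.3,
    "catalytic": 0.1,
    "protein_binding": 0.4,
    "kinase": 0.9,
}
consistent = propagate_scores(raw, PARENTS)
```

After propagation, a confident prediction for a specific class such as the toy `kinase` term lifts the scores of all of its ancestors, so a protein is never assigned a class without also being assigned that class's parents.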

Code Implementations: 1 repo
