CL, Mar 19, 2014

Using Entropy Estimates for DAG-Based Ontologies

arXiv:1403.4887v2, 4 citations
AI Analysis

This work addresses a specific issue in bioinformatics for researchers analyzing gene functions, but it appears incremental as it builds on existing methods for semantic similarity.

The authors tackled the problem of modeling functional similarity in gene annotations by proposing a novel entropy calculation for DAG-based ontologies to establish information content of terms, and they compared this metric to two others using semantic and sequence similarity.

Motivation: Entropy measurements on hierarchical structures have been used in methods for information retrieval and natural language modeling. Here we explore their application to semantic similarity. By finding shared ontology terms, semantic similarity can be established between annotated genes. A common procedure for establishing semantic similarity is to calculate the descriptiveness (information content, IC) of ontology terms and use these values to determine the similarity of annotations. Most often, the information content of an ontology term is calculated by analyzing its frequency in an annotation corpus. The inherent problems in using these values to model functional similarity motivate our work.

Summary: We present a novel calculation for establishing the entropy of a DAG-based ontology, which can be used in an alternative method for establishing the information content of its terms. We also compare our IC metric to two others using semantic and sequence similarity.
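To make the corpus-based procedure mentioned in the Motivation concrete, the following is a minimal sketch of the conventional frequency-based information content, IC(t) = -log2 p(t), where p(t) is the fraction of genes annotated to a term directly or through any of its descendants. It is not the entropy-based calculation the paper proposes, whose details are not given in the abstract. The toy ontology, the gene names, and the helper functions `ancestors` and `information_content` are all hypothetical and chosen only for illustration.

```python
import math

# Hypothetical toy ontology: each term maps to its parent terms (a DAG).
PARENTS = {
    "root": [],
    "A": ["root"],
    "B": ["root"],
    "C": ["A", "B"],  # a term with two parents, which a DAG allows
}

# Hypothetical annotation corpus: gene -> directly annotated terms.
ANNOTATIONS = {
    "gene1": {"C"},
    "gene2": {"A"},
    "gene3": {"B"},
    "gene4": {"root"},
}


def ancestors(term, parents):
    """Return the term together with all of its ancestors in the DAG."""
    seen = set()
    stack = [term]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def information_content(annotations, parents):
    """Frequency-based IC: -log2 of the fraction of genes annotated to a
    term directly or through any descendant (assumed convention)."""
    counts = {term: 0 for term in parents}
    for terms in annotations.values():
        # An annotation to a term implies annotation to all its ancestors.
        implied = set()
        for term in terms:
            implied |= ancestors(term, parents)
        for term in implied:
            counts[term] += 1
    total = len(annotations)
    # Terms with no annotations have undefined IC here and are skipped.
    return {t: -math.log2(c / total) for t, c in counts.items() if c > 0}


ic = information_content(ANNOTATIONS, PARENTS)
```

In this toy corpus the root term covers every gene and so carries no information, while the more specific term `C` receives the highest value. Because these values depend entirely on which genes happen to be annotated, they shift with the corpus, which is one reason a structure-based alternative may be attractive.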

