GN IR SOC-PH
Jan 8, 2013

Google matrix analysis of DNA sequences

arXiv:1301.1626v1
8 citations
Originality: Synthesis-oriented
AI Analysis

This work applies existing Google matrix techniques to DNA sequences, a new biological domain for the method. The contribution is incremental in technique, but it offers a network-based way to examine genetic similarities and differences between species.

The authors construct Google matrices of Markov transitions between nearby words in DNA sequences. The matrix elements follow a power-law distribution whose exponent is close to that of outgoing links in scale-free networks such as the WWW, while the sum of ingoing elements has a significantly larger exponent than is typical for the WWW, which leads to a slow algebraic decay of the PageRank probability. The spectrum of the matrix has a large gap, so relaxation on these networks is rapid. The authors also introduce a PageRank proximity correlator to measure the statistical similarity between species.

For DNA sequences of various species we construct the Google matrix G of Markov transitions between nearby words composed of several letters. The statistical distribution of matrix elements of this matrix is shown to be described by a power law with the exponent being close to those of outgoing links in such scale-free networks as the World Wide Web (WWW). At the same time, the sum of ingoing matrix elements is characterized by the exponent being significantly larger than those typical for WWW networks. This results in a slow algebraic decay of the PageRank probability determined by the distribution of ingoing elements. The spectrum of G is characterized by a large gap leading to a rapid relaxation process on the DNA sequence networks. We introduce the PageRank proximity correlator between different species which determines their statistical similarity from the viewpoint of Markov chains. The properties of other eigenstates of the Google matrix are also discussed. Our results establish scale-free features of DNA sequence networks showing their similarities and distinctions with the WWW and linguistic networks.
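
To make the construction in the abstract concrete, here is a minimal Python sketch of a Google matrix built from word-to-word transitions, with PageRank found by power iteration. It is not the authors' implementation, and several choices are assumptions the abstract does not confirm: the sequence is cut into consecutive non-overlapping words of `word_len` letters, "nearby" is taken to mean the immediately following word, the alphabet is `ACGT`, and the damping factor `alpha = 0.85` is the value customary for web graphs. The function names `google_matrix` and `pagerank` and the toy sequence are hypothetical.

```python
from itertools import product


def google_matrix(sequence, word_len=2, alpha=0.85):
    """Build a column-stochastic Google matrix from word-to-next-word counts.

    Assumptions (not taken from the paper): non-overlapping words,
    transitions only to the immediately following word, alphabet ACGT.
    """
    words = ["".join(p) for p in product("ACGT", repeat=word_len)]
    index = {w: i for i, w in enumerate(words)}
    n = len(words)

    # counts[to][from] = number of observed transitions from -> to
    counts = [[0.0] * n for _ in range(n)]
    chunks = [sequence[i:i + word_len]
              for i in range(0, len(sequence) - word_len + 1, word_len)]
    for src, dst in zip(chunks, chunks[1:]):
        if src in index and dst in index:  # skip words with other letters
            counts[index[dst]][index[src]] += 1.0

    G = [[0.0] * n for _ in range(n)]
    for j in range(n):
        col_sum = sum(counts[i][j] for i in range(n))
        for i in range(n):
            # Columns with no outgoing transitions are replaced by 1/n.
            s_ij = counts[i][j] / col_sum if col_sum > 0 else 1.0 / n
            G[i][j] = alpha * s_ij + (1.0 - alpha) / n
    return G, words


def pagerank(G, tol=1e-12, max_iter=1000):
    """Power iteration for the eigenvector of G with eigenvalue 1."""
    n = len(G)
    p = [1.0 / n] * n
    for _ in range(max_iter):
        q = [sum(G[i][j] * p[j] for j in range(n)) for i in range(n)]
        total = sum(q)
        q = [x / total for x in q]
        if sum(abs(a - b) for a, b in zip(q, p)) < tol:
            return q
        p = q
    return p


# Toy input, not real genomic data.
G, words = google_matrix("ACGTACGGTCAATTGCACGT" * 5, word_len=2)
p = pagerank(G)
ranking = sorted(zip(words, p), key=lambda t: -t[1])
```

Sorting the words by decreasing PageRank probability, as in `ranking`, gives the ordering against which a decay of the PageRank probability would be measured. A real genome would need longer words and a sparse matrix representation, which this sketch omits.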
