The Capacity of String-Replication Systems
This work addresses a theoretical problem at the intersection of computational biology and information theory, with potential implications for understanding genomic evolution. Its contribution is incremental, since it builds on known concepts of string replication.
The paper investigates whether string-replication systems can generate exponentially many sequences from a short initial sequence, and it provides exact capacities, or bounds on the capacities, for four fundamental systems.
It is known that the majority of the human genome consists of repeated sequences. Furthermore, it is believed that a significant part of the rest of the genome also originated from repeated sequences and has mutated to its current form. In this paper, we investigate the possibility of constructing an exponentially large number of sequences from a short initial sequence and simple replication rules, including those resembling genomic replication processes. In other words, our goal is to find out the capacity, or the expressive power, of these string-replication systems. Our results include exact capacities, and bounds on the capacities, of four fundamental string-replication systems.
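To make the notion of capacity concrete, the following is a minimal sketch under stated assumptions, not the paper's construction. It assumes that capacity is measured as the normalized logarithmic growth rate log_q(N_n) / n, where N_n is the number of distinct length-n strings the system can generate and q is the alphabet size. It also assumes one illustrative rule, duplicating a length-k substring in place (tandem duplication); the abstract does not say which four systems are studied. The names tandem_duplications, count_by_length, and rate are hypothetical.

```python
from math import log


def tandem_duplications(s, k):
    """Yield each string obtained by copying one length-k substring of s
    and inserting the copy directly after the original (assumed rule)."""
    for i in range(len(s) - k + 1):
        yield s[:i + k] + s[i:i + k] + s[i + k:]


def count_by_length(seed, k, max_len):
    """Map each reachable length to the number of distinct strings of that
    length. Every step adds exactly k symbols, so the strings of length
    len(seed) + j*k are exactly those reachable in j steps."""
    level = {seed}
    n = len(seed)
    counts = {n: 1}
    while n + k <= max_len:
        level = {t for s in level for t in tandem_duplications(s, k)}
        n += k
        counts[n] = len(level)
    return counts


def rate(count, n, alphabet_size):
    """Finite-length stand-in for capacity: log_q(count) / n (assumed definition)."""
    return log(count, alphabet_size) / n


if __name__ == "__main__":
    # Brute force is exponential in general; keep max_len small.
    for n, c in count_by_length("01", 1, 12).items():
        print(n, c, round(rate(c, n, 2), 3))
```

For the seed 01 with k = 1, the reachable strings of length n are exactly a run of 0s followed by a run of 1s, each run non-empty, so there are only n - 1 of them and the rate tends to zero. A system has positive capacity only when the count of distinct strings grows exponentially in n, which is the situation the abstract asks about.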