On the Capacity of Sequences of Coloring Channels
This work addresses a theoretical problem in information theory with potential applications in bioinformatics and molecular data storage, but it appears incremental as it builds on existing channel models.
The paper tackles the problem of determining the exact capacities of sequences of coloring channels, which model sequence-reconstruction problems in protein identification and molecular information storage. It provides exact capacities for uniform sunflowers, two arbitrary intersecting sets, and paths, and shows that capacity depends solely on a defined pairs graph, with bounds proven for cycles.
A single coloring channel is defined by a subset of letters it allows to pass through, while deleting all others. A sequence of coloring channels provides multiple views of the same transmitted letter sequence, forming a type of sequence-reconstruction problem useful for protein identification and information storage at the molecular level. We provide exact capacities of several sequences of coloring channels: uniform sunflowers, two arbitrary intersecting sets, and paths. We also show how this capacity depends solely on a related graph we define, called the pairs graph. Using this equivalence, we prove lower and upper bounds on the capacity, and a tailored bound for a coloring-channel sequence forming a cycle. In particular, for an alphabet of size $4$, these results give the exact capacity of all coloring-channel sequences except for a cycle of length $4$, for which we only provide bounds.