Random Access Expectation in DNA Storage and Fountain Codes
Provides theoretical bounds for random access efficiency in DNA storage, an incremental advance for coding theory applied to a specific domain.
The paper studies the expected number of coded symbols needed to decode a single information symbol in DNA storage, showing that for binary fully symmetric codes (equivalent to LT codes) the normalized random access expectation is at least π/4 ≈ 0.7854 and achievable at ≈0.7869.
Motivated by DNA data storage, we study the expected number of coded symbols drawn from a linear code until a desired information symbol can be decoded - the random access expectation. We focus on generator matrices with a type of symmetry, conjectured in prior work to be optimal, which we call fully symmetric. We point out an equivalence between binary fully symmetric codes and LT codes. Using this observation, we analyze the random access expectation of binary fully symmetric codes under a peeling decoder, in the large blocklength limit. Under these assumptions, the random access expectation, normalized by the number of information symbols, is at least π/4 {\approx} 0.7854, while a value of {\approx} 0.7869 is achievable.