IT · CO · Mar 6

The DNA Coverage Depth Problem: Duality, Weight Distributions, and Applications

arXiv:2603.06489v1 · 1 citation
Predicted impact: top 93% in IT · last 90 days
Originality: Incremental advance
AI Analysis

This work addresses a practical problem in DNA data storage for scenarios using structured codes over small fields, representing an incremental advance with specific applications.

The paper tackles the DNA coverage depth problem: computing the expected number of reads needed to recover all encoded strands in DNA data storage. It develops combinatorial tools based on duality and extended weight enumerators, and uses them to derive closed formulas for specific linear codes such as the simplex and Hamming codes.

The coverage depth problem in DNA data storage is about computing the expected number of reads needed to recover all encoded strands. Given a generator matrix of a linear code, this quantity equals the expected number of randomly drawn columns required to obtain full rank. While MDS codes are optimal when they exist, i.e., over large fields, practical scenarios may rely on structured code families defined over small fields. In this work, we develop combinatorial tools to solve the DNA coverage depth problem for various linear codes, based on duality arguments and the notion of extended weight enumerator. Using these methods, we derive closed formulas for the simplex, Hamming, ternary Golay, extended ternary Golay, and first-order Reed-Muller codes. The centerpiece of this paper is a general expression for the coverage depth of a linear code in terms of the weight distributions of its higher-field extensions.
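
The "expected number of randomly drawn columns required to obtain full rank" can be made concrete for very small codes by brute force. The sketch below is not the paper's method (which relies on duality and extended weight enumerators). It assumes a binary code, columns drawn uniformly at random with replacement, and a generator matrix of rank k. The function names `rank_gf2` and `expected_draws` are hypothetical, and columns are encoded as integer bitmasks purely for illustration.

```python
from fractions import Fraction
from functools import lru_cache


def rank_gf2(vectors):
    """Rank over GF(2) of vectors encoded as integer bitmasks."""
    pivots = {}  # leading bit position -> basis vector with that leading bit
    for v in vectors:
        while v:
            h = v.bit_length() - 1
            if h not in pivots:
                pivots[h] = v
                break
            v ^= pivots[h]  # cancel the leading bit and keep reducing
    return len(pivots)


def expected_draws(columns, k):
    """Exact expected number of uniform draws (with replacement, an assumption)
    from `columns` until the drawn columns have rank k over GF(2).

    Enumerates all subsets of columns, so it is only feasible for small n.
    """
    n = len(columns)
    if rank_gf2(columns) < k:
        raise ValueError("columns do not span a k-dimensional space")

    @lru_cache(maxsize=None)
    def e(seen):
        # `seen` is a bitmask of the distinct column indices drawn so far.
        chosen = [columns[i] for i in range(n) if (seen >> i) & 1]
        if rank_gf2(chosen) == k:
            return Fraction(0)
        unseen = [i for i in range(n) if not (seen >> i) & 1]
        # E[S] = 1 + (|S|/n) E[S] + (1/n) * sum over unseen j of E[S + j],
        # solved for E[S].
        total = sum(e(seen | (1 << i)) for i in unseen)
        return (n + total) / Fraction(len(unseen))

    return e(0)


# [7,3] binary simplex code: the columns are all nonzero vectors of F_2^3.
print(expected_draws(list(range(1, 8)), 3))  # 47/12
# Uncoded case (3x3 identity): the classical coupon-collector value.
print(expected_draws([0b001, 0b010, 0b100], 3))  # 11/2
```

For the simplex code the result agrees with a direct count: once the drawn columns span a subspace of dimension d, a new draw enlarges the span with probability (8 - 2^d)/7, giving 1 + 7/6 + 7/4 = 47/12 expected draws.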
