SynDe: Syndrome-guided Decoding of Raw Nanopore Reads
This work addresses error correction in DNA data storage and offers researchers in genomics and data storage a more flexible and efficient decoder. The contribution is incremental, since it builds on prior convolutional-code methods.
The authors tackle the high error rates of nanopore sequencing in DNA data storage by proposing SynDe, a syndrome-guided decoder that supports any linear error correction code with a low-complexity graphical representation. SynDe often approaches or exceeds the performance of existing algorithms at lower time complexity.
Nanopore sequencing technology remains highly error-prone, making efficient error correction essential in DNA-based data storage. Prior work addressed high error rates using convolutional codes whose decoder is coupled to the basecaller, but such approaches accommodate only a limited number of code classes and incur significant decoding complexity. To overcome these limitations, we propose two algorithms: PrimerSeeker, which efficiently detects primer sequences in raw nanopore sequencing reads, and SynDe, a decoder that operates on the same raw reads and supports any linear error correction code with a low-complexity graphical representation. PrimerSeeker provides primer location estimates close to those of existing approaches while being better suited to real-time primer detection during sequencing. SynDe performs well with convolutional codes augmented with periodic markers, often approaching or exceeding the performance of existing algorithms at lower time complexity. Remarkably, the confidence scores produced by SynDe reliably identify which of its outputs should be discarded.
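To make the term "syndrome" concrete, the following minimal sketch computes the syndrome of a received word for a small linear block code and uses it to correct a single bit flip. It is a textbook illustration only, not the SynDe algorithm: the (7,4) Hamming code, the hard-decision binary input, and the helper names `syndrome` and `correct_single_error` are all assumptions chosen for brevity, whereas SynDe operates on raw nanopore reads.

```python
# Illustrative only: syndrome computation for a (7,4) Hamming code over GF(2).
# This is NOT SynDe; the code choice and function names are assumptions.

# Parity-check matrix H. Column j (0-indexed) is the binary expansion of j + 1
# (top row = most significant bit), so a single flipped bit at index j
# produces a syndrome that reads as the integer j + 1.
H = [
    [0, 0, 0, 1, 1, 1, 1],
    [0, 1, 1, 0, 0, 1, 1],
    [1, 0, 1, 0, 1, 0, 1],
]


def syndrome(H, word):
    """Return H * word^T over GF(2) as a list of bits."""
    return [sum(h * w for h, w in zip(row, word)) % 2 for row in H]


def correct_single_error(H, word):
    """Flip the bit indicated by the syndrome, if the syndrome is nonzero."""
    s = syndrome(H, word)
    pos = int("".join(map(str, s)), 2)  # 0 means no error detected
    fixed = list(word)
    if pos:
        fixed[pos - 1] ^= 1
    return fixed


codeword = [1, 1, 1, 0, 0, 0, 0]   # valid: columns 1, 2, 3 of H XOR to zero
received = list(codeword)
received[4] ^= 1                   # inject one bit flip at index 4

print(syndrome(H, codeword))       # all-zero syndrome for a valid codeword
print(syndrome(H, received))       # nonzero syndrome pointing at the error
print(correct_single_error(H, received) == codeword)
```

A zero syndrome means the word satisfies every parity check; a nonzero syndrome carries information about the error pattern without depending on the transmitted codeword. Any linear code admits such a parity-check description.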