LG, IT · Feb 28, 2025

Efficient Transformer-based Decoder for Varshamov-Tenengolts Codes

arXiv:2502.21060v1 · h-index: 2
Originality: Incremental advance
AI Analysis

This paper addresses error correction in DNA data storage, offering an incremental improvement over existing methods for decoding multiple errors.

The paper tackles the correction of multiple insertion, deletion, and substitution errors in DNA data storage using Varshamov-Tenengolts codes. It reports perfect single-error correction, significantly improved bit and frame error rates for multiple errors, and an order-of-magnitude reduction in time consumption compared to other soft decoders.

In recent years, the rise of DNA data storage technology has brought significant attention to the challenge of correcting insertion, deletion, and substitution (IDS) errors. Among various coding methods for IDS correction, Varshamov-Tenengolts (VT) codes, primarily designed for single-error correction, have emerged as a central research focus. While existing decoding methods achieve high accuracy in correcting a single error, they often fail to correct multiple IDS errors. In this work, we observe that VT codes retain some capability for addressing multiple errors by introducing a transformer-based VT decoder (TVTD) along with symbol- and statistic-based codeword embedding. Experimental results demonstrate that the proposed TVTD achieves perfect correction of a single error. Furthermore, when decoding multiple errors across various codeword lengths, the bit error rate and frame error rate are significantly improved compared to existing hard decision and soft-in soft-out algorithms. Additionally, through model architecture optimization, the proposed method reduces time consumption by an order of magnitude compared to other soft decoders.
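For context on the single-error guarantee the abstract mentions, here is a minimal sketch of the classical hard-decision decoder for one deletion in a binary VT code. It is not the paper's TVTD. It assumes the binary code VT_a(n), the set of length-n words x with sum(i * x_i) ≡ a (mod n + 1), while DNA storage work often uses a quaternary variant. The function names `vt_syndrome` and `correct_single_deletion` are hypothetical, chosen for illustration.

```python
def vt_syndrome(bits):
    """Return sum(i * x_i) mod (n + 1), with positions i counted from 1."""
    n = len(bits)
    return sum(i * b for i, b in enumerate(bits, start=1)) % (n + 1)


def correct_single_deletion(received, n, a=0):
    """Recover a codeword of VT_a(n) from a copy with exactly one bit deleted.

    Assumption: `received` really is a codeword with one deletion; the
    function does not detect or handle multiple errors.
    """
    if len(received) != n - 1:
        raise ValueError("expected exactly one deleted symbol")

    weight = sum(received)
    partial = sum(i * b for i, b in enumerate(received, start=1))
    # How much checksum is missing compared with the original codeword.
    deficiency = (a - partial) % (n + 1)

    if deficiency <= weight:
        # A 0 was deleted: reinsert it so that exactly `deficiency`
        # ones lie to its right.
        pos, ones = len(received), 0
        while ones < deficiency:
            pos -= 1
            ones += received[pos]
        return received[:pos] + [0] + received[pos:]

    # A 1 was deleted: reinsert it so that exactly
    # `deficiency - weight - 1` zeros lie to its left.
    zeros_needed = deficiency - weight - 1
    pos, zeros = 0, 0
    while zeros < zeros_needed:
        zeros += 1 - received[pos]
        pos += 1
    return received[:pos] + [1] + received[pos:]


# Usage: [1, 0, 0, 0, 0, 1] has checksum 1 + 6 = 7, so it lies in VT_0(6).
codeword = [1, 0, 0, 0, 0, 1]
assert vt_syndrome(codeword) == 0
damaged = codeword[:2] + codeword[3:]  # delete the third bit
assert correct_single_deletion(damaged, n=6, a=0) == codeword
```

The rule works because deleting a 0 lowers the checksum by the number of ones to its right, while deleting a 1 lowers it by the weight of the received word plus one plus the number of zeros to its left, so the two cases never overlap. With two or more errors this argument no longer holds, which is the regime the abstract says a learned decoder targets.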

