Image Storage on Synthetic DNA Using Autoencoders
This work addresses the need for efficient and error-resistant data storage solutions for cold data, representing an incremental advancement in adapting machine learning methods to DNA storage challenges.
The paper tackles the problem of storing images on synthetic DNA by developing convolutional autoencoders for lossy compression and encoding into quaternary code, with results showing improved robustness to substitution errors through a noise model during training.
Over the past years, the ever-growing demand for data storage, and more specifically for "cold" (rarely accessed) data, has motivated research into alternative storage systems. Because of their biochemical characteristics, synthetic DNA molecules are now considered serious candidates for this new kind of storage. This paper presents results on lossy image compression methods based on convolutional autoencoders adapted to DNA data storage. The model architectures presented here are designed to efficiently compress images, encode them into a quaternary code, and finally store them in synthetic DNA molecules. This work also aims to make the compression models better suited to the problems encountered when storing data in DNA, namely that DNA writing, storage, and reading are error-prone processes. The main takeaways of this kind of compressive autoencoder are our quantization and its robustness to substitution errors, obtained through the noise model that we use during training.
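To make the quantization and substitution-noise ideas concrete, the following is a minimal sketch and not the method used in the paper. It assumes a latent vector with values in [0, 1], a uniform 4-level quantizer, an arbitrary symbol order A, C, G, T, and a uniform substitution channel. The function names (`quantize`, `to_dna`, `from_dna`, `substitute`) are hypothetical and chosen only for illustration.

```python
import random

# Assumed symbol order; the abstract does not specify a mapping.
NUCLEOTIDES = "ACGT"


def quantize(latent, levels=4):
    """Uniformly quantize values in [0, 1] to integers in {0, ..., levels - 1}."""
    symbols = []
    for x in latent:
        x = min(max(x, 0.0), 1.0)  # clip to the assumed latent range
        symbols.append(min(int(x * levels), levels - 1))
    return symbols


def to_dna(symbols):
    """Map quaternary symbols to a nucleotide string."""
    return "".join(NUCLEOTIDES[s] for s in symbols)


def from_dna(strand):
    """Map a nucleotide string back to quaternary symbols."""
    return [NUCLEOTIDES.index(c) for c in strand]


def substitute(symbols, rate, rng):
    """With probability `rate`, replace each symbol by a different one,
    chosen uniformly among the three alternatives (assumed noise model)."""
    noisy = []
    for s in symbols:
        if rng.random() < rate:
            s = rng.choice([t for t in range(4) if t != s])
        noisy.append(s)
    return noisy


# Hypothetical latent values standing in for an encoder output.
latent = [0.05, 0.30, 0.55, 0.80, 0.99]
symbols = quantize(latent)
strand = to_dna(symbols)
print(strand)  # ACGTT

# Simulate substitution errors on the stored strand.
rng = random.Random(0)
noisy_strand = to_dna(substitute(symbols, rate=0.1, rng=rng))
print(noisy_strand)
```

In a training setting, a channel like `substitute` would be applied to the quantized latent between encoder and decoder so that the decoder learns to tolerate corrupted symbols. How the paper makes this step compatible with gradient-based training is not stated in the abstract and is not modeled here.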