颜林林
(2022-07-08 07:19):
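#paper doi:10.1038/s41540-022-00233-w npj Systems Biology and Applications, Adaptive coding for DNA storage with high storage density and low coverage. Large-scale data storage built on biological macromolecules such as DNA is one of the directions I am personally most interested in. Many excellent papers have suddenly appeared in this field over the past few years, which may be related to the development of high-throughput sequencing and to advances in synthetic biology. This paper from Dalian University of Technology is one such example. It proposes an adaptive-coding DNA storage system that applies different coding schemes to different coding regions, stores 698 KB of image, video, and PDF files in DNA, and then losslessly decodes them back into the original data. Compared with earlier work of this kind, the paper designs the DNA molecular properties and constraints more carefully during encoding, so that, while maintaining base balance and avoiding non-specific hybridization, the system can tolerate sequencing-error noise at as low a sequencing depth as possible. The original content is split into fragments, each with an index segment attached, so that the stored content can be read by random access through specific amplification and sequencing. Unfortunately, the paper offers only theoretical simulation and discussion and has not yet carried out actual DNA synthesis and sequencing, which greatly weakens its persuasiveness.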
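To make the ideas in the comment above concrete (splitting data into indexed fragments and checking base-balance constraints), here is a minimal sketch. It is not the paper's adaptive coding scheme: the 2-bits-per-base mapping, the 2-byte index, the 16-byte payload size, and all function names are assumptions chosen purely for illustration. The constraint functions only measure GC content and homopolymer length; they do not enforce any threshold, and primer-based random access is not modeled.

```python
from typing import List

# Hypothetical mapping, 2 bits per base: 00->A, 01->C, 10->G, 11->T.
BASES = "ACGT"


def bytes_to_bases(data: bytes) -> str:
    """Map each byte to four bases, most significant bit pair first."""
    return "".join(BASES[(b >> shift) & 0b11] for b in data for shift in (6, 4, 2, 0))


def bases_to_bytes(seq: str) -> bytes:
    """Inverse of bytes_to_bases; expects a length that is a multiple of 4."""
    out = bytearray()
    for i in range(0, len(seq), 4):
        value = 0
        for ch in seq[i:i + 4]:
            value = (value << 2) | BASES.index(ch)
        out.append(value)
    return bytes(out)


def gc_content(seq: str) -> float:
    """Fraction of G and C bases, a common base-balance measure."""
    return (seq.count("G") + seq.count("C")) / len(seq)


def max_homopolymer(seq: str) -> int:
    """Length of the longest run of one repeated base."""
    longest = run = 1
    for prev, cur in zip(seq, seq[1:]):
        run = run + 1 if cur == prev else 1
        longest = max(longest, run)
    return longest


def encode(data: bytes, payload_bytes: int = 16) -> List[str]:
    """Split data into chunks and prefix each chunk with a 2-byte index."""
    strands = []
    for index, start in enumerate(range(0, len(data), payload_bytes)):
        chunk = data[start:start + payload_bytes]
        strands.append(bytes_to_bases(index.to_bytes(2, "big") + chunk))
    return strands


def decode(strands: List[str]) -> bytes:
    """Recover the data from strands given in any order, using the index."""
    records = []
    for strand in strands:
        raw = bases_to_bytes(strand)
        records.append((int.from_bytes(raw[:2], "big"), raw[2:]))
    records.sort(key=lambda record: record[0])
    return b"".join(chunk for _, chunk in records)
```

A round trip such as `decode(list(reversed(encode(data))))` returns the original bytes, since each strand carries its own position. A real system would also reject or re-encode strands whose `gc_content` or `max_homopolymer` values fall outside chosen limits and would add error correction, neither of which this sketch attempts.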
IF: 3.500, Q1
NPJ systems biology and applications,
2022-07-04.
DOI: 10.1038/s41540-022-00233-w
PMID: 35788589
Adaptive coding for DNA storage with high storage density and low coverage
Abstract:
The rapid development of information technology has generated substantial data, which urgently requires new storage media and storage methods. DNA, as a storage medium with high density, high durability, and ultra-long storage time characteristics, is promising as a potential solution. However, DNA storage is still in its infancy and suffers from low space utilization of DNA strands, high read coverage, and poor coding coupling. Therefore, in this work, an adaptive coding DNA storage system is proposed to use different coding schemes for different coding region locations, and the method of adaptively generating coding constraint thresholds is used to optimize at the system level to ensure the efficient operation of each link. Images, videos, and PDF files of size 698 KB were stored in DNA using adaptive coding algorithms. The data were sequenced and losslessly decoded into raw data. Compared with previous work, the DNA storage system implemented by adaptive coding proposed in this paper has high storage density and low read coverage, which promotes the development of carbon-based storage systems.