Literature Collection and Sharing Platform

颜林林 (2022-06-01 07:41):

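#paper doi:10.1101/2022.05.29.493900 bioRxiv 2022, Cost-efficient whole genome-sequencing using novel mostly natural sequencing-by-synthesis chemistry and open fluidics platform. This is new work from Ultima Genomics, a startup out of MIT, that rethinks current sequencing-by-synthesis methods at the level of design principles. By building the fluidics and optics around a large circular wafer, it makes the associated reagents and consumables cheaper. Where Illumina sequencing adds one base per cycle using reversible terminators, this paper uses a non-terminating chemistry that speeds up base incorporation, paired with a CNN-based algorithm for accurate base calling. In practice, the method delivers high-throughput sequencing in under 20 hours, with read lengths of about 300 bp and high quality (Q30 > 85%), at a cost of $1 per Gb of data. The paper also benchmarks the platform on GIAB and 1000 Genomes samples to validate the accuracy of the sequencing results. Many of us work with high-throughput sequencing every day and have long taken the Illumina sequencing principle for granted, accepting its monopoly and its status as the ceiling of the field, and we rarely ask what could still be improved. This paper is a chance to broaden our view on that question.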

bioRxiv, 2022. DOI: 10.1101/2022.05.29.493900

Cost-efficient whole genome-sequencing using novel mostly natural sequencing-by-synthesis chemistry and open fluidics platform

Gilad Almogy, Mark Pratt, Florian Oberstrass, Linda Lee, Dan Mazur, Nate Beckett, Omer Barad, Ilya Soifer, Eddie Perelman, Yoav Etzioni, Martin Sosa, April Jung, Tyson Clark, Gila Lithwick-Yanai, Sarah Pollock, Gil Hornung, Maya Levy, Matthew Coole, Tom Howd, Megan Shand, Yossi Farjoun, James Emery, Giles Hall, Samuel K Lee, Takuto Sato, Ricky Magner, Sophie Low, Andrew Bernier, Bharathi Gandi, Jack Stohlman, Corey Nolet, Siobhan Donovan, Brendan Blumenstiel, Michelle Cipicchio, Sheila Dodge, Eric Banks, Niall Lennon, Stacey Gabriel, Doron Lipson

Abstract:

We introduce a massively parallel novel sequencing platform that combines an open flow cell design on a circular wafer with a large surface area and mostly natural nucleotides that allow optical end-point detection without reversible terminators. This platform enables sequencing billions of reads with longer read length (~300bp) and fast run times (<20hrs) with high base accuracy (Q30 > 85%), at a low cost of $1/Gb. We establish system performance by whole-genome sequencing of the Genome-In-A-Bottle reference samples HG001-7, demonstrating high accuracy for SNPs (99.6%) and Indels in homopolymers up to length 10 (96.4%) across the vast majority (>98%) of the defined high-confidence regions of these samples. We demonstrate scalability of the whole-genome sequencing workflow by sequencing an additional 224 selected samples from the 1000 Genomes project, achieving high concordance with reference data.
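To make the headline numbers in the abstract concrete, here is a minimal Python sketch. It uses the standard Phred definition Q = -10·log10(p) to show what "Q30" means as a per-base error probability, and turns the stated $1/Gb into a rough per-genome sequencing cost. The function names are hypothetical, and the genome size (3.1 Gb) and 30x coverage are illustrative assumptions that do not come from the paper.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score Q to a per-base error probability."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Convert a per-base error probability to a Phred quality score."""
    return -10 * math.log10(p)


def sequencing_cost_usd(genome_size_gb: float, coverage: float,
                        cost_per_gb: float) -> float:
    """Rough data cost: total sequenced gigabases times the price per Gb."""
    return genome_size_gb * coverage * cost_per_gb


if __name__ == "__main__":
    # Q30 corresponds to roughly 1 error in 1000 bases.
    print("Q30 error probability:", phred_to_error_prob(30))

    # ASSUMPTION: 3.1 Gb human genome at 30x coverage (illustrative values,
    # not taken from the paper); $1/Gb is the figure stated in the abstract.
    print("Approximate cost (USD):", sequencing_cost_usd(3.1, 30, 1.0))
```

Under these assumed values, the data for one 30x human genome would cost on the order of 93 USD. This counts only the per-Gb figure quoted in the abstract, and "Q30 > 85%" means that more than 85% of bases have an estimated error probability of at most 0.001.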
