Vincent (2022-03-31 11:11):
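#paper doi: 10.1186/s13059-021-02443-7 Genome Biol 2021 Technology dictates algorithms: recent developments in read alignment. Read alignment is a foundational step in the bioinformatic analysis of sequencing data. This paper surveys 107 read alignment tools in detail and experimentally evaluates the computational efficiency and speed of 11 of them. The authors argue that alignment algorithms and sequencing technologies co-evolve: a new technology triggers the development of a wave of tools, while the underlying core algorithms rarely change in any revolutionary way (they are merely tailored for the new technology). The survey finds that hash-table-based indexing of the genome is the most common approach, but its drawback is a relatively large storage requirement; suffix-tree-based indexing also tends to be fast and is being used more and more widely. The paper also finds that local alignment methods usually rely on Hamming distance and the Smith-Waterman algorithm to determine the exact position of a read in the genome. Finally, it reviews how long reads have influenced the development of alignment methods, among other topics.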
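To make the hash-table indexing and Hamming-distance ideas above concrete, here is a minimal toy sketch of a seed-and-verify aligner. It is not taken from the paper or from any of the surveyed tools; the function names (`build_kmer_index`, `hamming`, `align_read`) and the parameters `k` and `max_mismatches` are hypothetical, and it handles substitutions only (no indels, which would require something like Smith-Waterman).

```python
from collections import defaultdict


def build_kmer_index(reference, k):
    """Hash-table index: map every k-mer to the list of positions where it occurs."""
    index = defaultdict(list)
    for i in range(len(reference) - k + 1):
        index[reference[i:i + k]].append(i)
    return index


def hamming(a, b):
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("Hamming distance needs equal-length strings")
    return sum(x != y for x, y in zip(a, b))


def align_read(read, reference, index, k, max_mismatches=2):
    """Seed with exact k-mer hits, then verify each candidate by Hamming distance.

    Returns a sorted list of (start_position, mismatches) tuples.
    """
    hits = set()
    for offset in range(len(read) - k + 1):
        seed = read[offset:offset + k]
        # .get avoids inserting empty entries into the defaultdict
        for pos in index.get(seed, ()):
            start = pos - offset
            if start < 0 or start + len(read) > len(reference):
                continue  # candidate would run off the reference
            window = reference[start:start + len(read)]
            distance = hamming(read, window)
            if distance <= max_mismatches:
                hits.add((start, distance))
    return sorted(hits)


if __name__ == "__main__":
    reference = "ACGTACGTTAGCCGATAGGCTA"
    k = 4
    index = build_kmer_index(reference, k)
    # Read copied from reference[8:18] with its last base changed (G -> T)
    print(align_read("TAGCCGATAT", reference, index, k))
```

The index stores one entry per reference position, which illustrates the storage cost mentioned above: memory grows with genome length, whatever the number of reads.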
IF: 10.100, Q1. Genome Biology, 2021-08-26. DOI: 10.1186/s13059-021-02443-7 PMID: 34446078
Technology dictates algorithms: recent developments in read alignment
Abstract:
Aligning sequencing reads onto a reference is an essential step of the majority of genomic analysis pipelines. Computational algorithms for read alignment have evolved in accordance with technological advances, leading to today's diverse array of alignment methods. We provide a systematic survey of algorithmic foundations and methodologies across 107 alignment methods, for both short and long reads. We provide a rigorous experimental evaluation of 11 read aligners to demonstrate the effect of these underlying algorithms on speed and efficiency of read alignment. We discuss how general alignment algorithms have been tailored to the specific needs of various domains in biology.