颜林林
(2022-07-13 00:46):
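#paper doi:10.1093/bib/bbac221 Briefings in Bioinformatics, 2022, A comprehensive benchmarking of WGS-based deletion structural variant callers. This is a methodological tool-comparison paper. It covers tools that identify structural variants (SVs) from whole-genome sequencing data, and it is restricted to deletion-type SVs. As gold standards, the authors used the genome-in-a-bottle structural variant set and a set of structural variants from a mouse model that had been validated by PCR, so that performance metrics such as sensitivity and precision could be computed accurately for each tool. The results echo those of earlier work of this kind: performance does vary widely between tools, and a few tools do strike a good balance between sensitivity and precision. The paper closes with recommendations on which tools to use for deletions of different lengths. It is a solid, conventional piece of work, and carefully done. The more entertaining part is the complaint buried in the tool selection: after excluding tools that require paired samples, tools that can only detect very small variants, and tools that only support long-read sequencing data, the authors were left with 61 suitable tools, yet they tested only 15 and 14 of them (on the mouse and human data, respectively), simply because none of the others could be installed! I sympathize deeply. Leaving aside the authors who are unwilling to release their source code for others to use, even many open-source tools are not easy to get working, and it is not rare to find a tool that only runs after you have read its source code and debugged it by hand.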
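To make the two metrics concrete, here is a minimal sketch of how precision and sensitivity could be computed for one caller against a complete gold-standard set of deletions. It is an illustration only, not the paper's pipeline: the `Deletion` type, the `benchmark` function, the matching rule (both breakpoints within a tolerance) and the 100 bp default are all assumptions made for this example, and the coordinates are invented toy data.

```python
from typing import NamedTuple


class Deletion(NamedTuple):
    """A deletion call: chromosome plus start and end coordinates (hypothetical type)."""
    chrom: str
    start: int
    end: int


def matches(call: Deletion, truth: Deletion, tol: int) -> bool:
    # Assumed matching rule: same chromosome, and both breakpoints
    # lie within `tol` bp of the gold-standard breakpoints.
    return (
        call.chrom == truth.chrom
        and abs(call.start - truth.start) <= tol
        and abs(call.end - truth.end) <= tol
    )


def benchmark(calls: list[Deletion], gold: list[Deletion], tol: int = 100) -> dict:
    """Count TP/FP/FN and derive precision and sensitivity.

    Each gold-standard deletion can be matched by at most one call.
    """
    unmatched = list(gold)  # gold deletions not yet claimed by a call
    tp = 0
    for call in calls:
        for i, truth in enumerate(unmatched):
            if matches(call, truth, tol):
                tp += 1
                del unmatched[i]
                break
    fp = len(calls) - tp   # calls with no gold-standard counterpart
    fn = len(unmatched)    # gold-standard deletions that were missed
    precision = tp / (tp + fp) if calls else 0.0
    sensitivity = tp / (tp + fn) if gold else 0.0
    return {"tp": tp, "fp": fp, "fn": fn,
            "precision": precision, "sensitivity": sensitivity}


# Toy data, invented for illustration only.
gold = [Deletion("chr1", 1000, 2000),
        Deletion("chr1", 5000, 5600),
        Deletion("chr2", 300, 900)]
calls = [Deletion("chr1", 1010, 1990),   # close to the first gold deletion
         Deletion("chr1", 8000, 8500)]   # no gold-standard counterpart

result = benchmark(calls, gold)
print(result)
```

Both numbers depend on the gold standard being complete: a call with no counterpart is counted as a false positive only because the truth set is assumed to contain every real deletion.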
IF: 6.800, Q1
Briefings in bioinformatics,
2022-07-18.
DOI: 10.1093/bib/bbac221
PMID: 35753701
PMCID:PMC9294411
A comprehensive benchmarking of WGS-based deletion structural variant callers
Abstract:
Advances in whole-genome sequencing (WGS) promise to enable the accurate and comprehensive structural variant (SV) discovery. Dissecting SVs from WGS data presents a substantial number of challenges and a plethora of SV detection methods have been developed. Currently, evidence that investigators can use to select appropriate SV detection tools is lacking. In this article, we have evaluated the performance of SV detection tools on mouse and human WGS data using a comprehensive polymerase chain reaction-confirmed gold standard set of SVs and the genome-in-a-bottle variant set, respectively. In contrast to the previous benchmarking studies, our gold standard dataset included a complete set of SVs allowing us to report both precision and sensitivity rates of the SV detection methods. Our study investigates the ability of the methods to detect deletions, thus providing an optimistic estimate of SV detection performance as the SV detection methods that fail to detect deletions are likely to miss more complex SVs. We found that SV detection tools varied widely in their performance, with several methods providing a good balance between sensitivity and precision. Additionally, we have determined the SV callers best suited for low- and ultralow-pass sequencing data as well as for different deletion length categories.
Keywords: