颜林林
(2022-09-15 22:35):
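#paper doi:10.1002/humu.24455 Human Mutation, 2022, de novo variant calling identifies cancer mutation signatures in the 1000 Genomes Project. This paper presents a GPU-accelerated tool that detects de novo variants (DNVs) from whole-genome sequencing data of trios (a family of three: both parents and one child). The authors used the tool to reanalyze three large trio cohorts: the Simons Simplex Collection (SSC), Simons Foundation Powering Autism Research (SPARK), and the 1000 Genomes Project (1000G), whose samples come from peripheral blood, saliva, and cell lines, respectively. The number and characteristics of the DNVs found in the cell-line samples clearly deviated from expectations. Analysis of the features of these 1000G DNVs showed that they resemble those of B-cell lymphoma, from which the authors infer that most of them are artifacts introduced during cell line preparation (i.e., EBV treatment).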
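To make the idea of trio-based DNV detection concrete, here is a minimal Python sketch of the textbook criterion: the child carries an allele that neither parent shows. This is not the paper's GPU workflow. The `Call` class, the `is_candidate_dnv` function, and the depth and allele-balance thresholds are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass
class Call:
    """One sample's call at one site (hypothetical, simplified record)."""
    genotype: str   # "0/0", "0/1", or "1/1"; phased "0|1" is also accepted
    ref_reads: int  # reads supporting the reference allele
    alt_reads: int  # reads supporting the alternate allele

    @property
    def depth(self) -> int:
        return self.ref_reads + self.alt_reads

    @property
    def allele_balance(self) -> float:
        # Fraction of reads carrying the alternate allele.
        return self.alt_reads / self.depth if self.depth else 0.0

    @property
    def alleles(self) -> tuple:
        # Normalize phased/unphased notation and allele order.
        return tuple(sorted(self.genotype.replace("|", "/").split("/")))


def is_candidate_dnv(child: Call, father: Call, mother: Call,
                     min_depth: int = 10,
                     ab_range: tuple = (0.25, 0.75)) -> bool:
    """Naive filter; the thresholds are illustrative assumptions."""
    # All three samples need enough coverage to trust the genotypes.
    if min(child.depth, father.depth, mother.depth) < min_depth:
        return False
    # Child must be heterozygous, both parents homozygous reference.
    if child.alleles != ("0", "1"):
        return False
    if father.alleles != ("0", "0") or mother.alleles != ("0", "0"):
        return False
    # Any alternate read in a parent suggests an inherited variant.
    if father.alt_reads > 0 or mother.alt_reads > 0:
        return False
    # A germline heterozygous site should have allele balance near 0.5.
    low, high = ab_range
    return low <= child.allele_balance <= high


child = Call("0/1", ref_reads=15, alt_reads=14)
father = Call("0/0", ref_reads=30, alt_reads=0)
mother = Call("0/0", ref_reads=28, alt_reads=0)
print(is_candidate_dnv(child, father, mother))
```

The allele-balance check is relevant to the finding summarized above: a variant that arose during cell culture is typically present in only a fraction of cells, so its allele balance tends to fall below 0.5, which is one of the metrics the abstract reports checking.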
de novo variant calling identifies cancer mutation signatures in the 1000 Genomes Project
Abstract:
Detection of de novo variants (DNVs) is critical for studies of disease-related variation and mutation rates. To accelerate DNV calling, we developed a graphics processing units-based workflow. We applied our workflow to whole-genome sequencing data from three parent-child sequenced cohorts including the Simons Simplex Collection (SSC), Simons Foundation Powering Autism Research (SPARK), and the 1000 Genomes Project (1000G) that were sequenced using DNA from blood, saliva, and lymphoblastoid cell lines (LCLs), respectively. The SSC and SPARK DNV callsets were within expectations for number of DNVs, percent at CpG sites, phasing to the paternal chromosome of origin, and average allele balance. However, the 1000G DNV callset was not within expectations and contained excessive DNVs that are likely cell line artifacts. Mutation signature analysis revealed 30% of 1000G DNV signatures matched B-cell lymphoma. Furthermore, we found variants in DNA repair genes and at Clinvar pathogenic or likely-pathogenic sites and significant excess of protein-coding DNVs in IGLL5; a gene known to be involved in B-cell lymphomas. Our study provides a new rapid DNV caller for the field and elucidates important implications of using sequencing data from LCLs for reference building and disease-related projects.
Keywords: