笑对人生
(2022-10-05 00:01):
#paper doi: 10.1186/s13059-016-0893-4. DeconstructSigs: delineating mutational processes in single tumors distinguishes DNA repair deficiencies and patterns of carcinoma evolution. Genome Biol. 2016 Feb 22;17:31.
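The concept of a mutational signature was first proposed by Alexandrov LB, et al. (Nature, 2013). That study used non-negative matrix factorization (NMF) to identify 21 mutational signatures, each defined over 96 trinucleotide mutation contexts. A recent study in Science reported 58 previously unrecognized mutational signatures (Degasperi A, Science. 2022). Compared with earlier work, the deconstructSigs package developed in this paper can analyze, within a single tumor sample, the mutational signatures produced by environmental exposures, DNA damage repair abnormalities, and mutagenic processes.

The COSMIC website (https://cancer.sanger.ac.uk/signatures/) currently groups signatures into four classes by variant type: SBS signatures (single base substitutions, 95 signatures), DBS signatures (doublet base substitutions, 11 signatures), ID signatures (small insertions and deletions, 18 signatures), and CN signatures (copy number variations, 24 signatures). The deconstructSigs workflow has two steps: (1) build the input with mut.to.sigs.input; (2) predict signatures with whichSignatures.

The NMF mentioned above is an algorithm for discovering features in data and has been widely used in image recognition. Unlike PCA or SVD, it constrains every entry of the factor matrices to be non-negative (in most applications negative entries have no meaningful interpretation). The basic idea is that, for a given non-negative matrix V, NMF looks for a non-negative matrix W and a non-negative matrix H such that V ≈ W × H, so that one non-negative matrix is decomposed into the product of two non-negative matrices. The factorization is computed iteratively, and the updates stop once W and H have converged. Each column of V is an observation and each row is a feature, for example an RNA-seq expression matrix with samples as columns and genes as rows. W is called the basis matrix: its columns are the basis vectors (in signature analysis, the signatures themselves). It is generally not square, so the name has nothing to do with a non-zero determinant. H is called the coefficient or weight matrix. Using H in place of the original matrix gives a lower-dimensional representation of the data features, which also reduces storage.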
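The factorization described above can be sketched with the classic multiplicative update rules of Lee and Seung. This is only a minimal illustration of NMF in general, written in plain Python. It is not code from the paper or from the deconstructSigs package, and the toy matrix, the helper names (`matmul`, `transpose`, `frobenius_error`, `nmf`) and the parameters (`rank`, `n_iter`, `seed`, `eps`) are all assumptions made for the example.

```python
import random


def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    cols_B = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols_B] for row in A]


def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]


def frobenius_error(V, W, H):
    """Frobenius norm of V - W*H, used here as the reconstruction error."""
    WH = matmul(W, H)
    return sum((v - wh) ** 2
               for row_v, row_wh in zip(V, WH)
               for v, wh in zip(row_v, row_wh)) ** 0.5


def nmf(V, rank, n_iter=200, seed=0, eps=1e-9):
    """Approximate non-negative V (features x samples) as W*H.

    W is features x rank (basis matrix), H is rank x samples (coefficients).
    Uses a fixed number of multiplicative updates instead of a convergence test.
    """
    rng = random.Random(seed)
    n_rows, n_cols = len(V), len(V[0])
    # Random positive initialisation keeps every later update non-negative.
    W = [[rng.random() + eps for _ in range(rank)] for _ in range(n_rows)]
    H = [[rng.random() + eps for _ in range(n_cols)] for _ in range(rank)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[h * n / (d + eps) for h, n, d in zip(rh, rn, rd)]
             for rh, rn, rd in zip(H, num, den)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[w * n / (d + eps) for w, n, d in zip(rw, rn, rd)]
             for rw, rn, rd in zip(W, num, den)]
    return W, H


# Toy data (hypothetical): 4 features x 5 samples built from 2 hidden components.
W_true = [[1.0, 0.1], [2.0, 0.2], [0.1, 1.5], [0.3, 3.0]]
H_true = [[1.0, 2.0, 0.5, 3.0, 1.0], [2.0, 0.5, 1.0, 1.0, 4.0]]
V = matmul(W_true, H_true)

W, H = nmf(V, rank=2)
print("reconstruction error:", frobenius_error(V, W, H))
```

In a mutational signature setting, V would hold mutation counts with the 96 trinucleotide contexts as rows and tumor samples as columns, W would hold the signatures, and H would hold the contribution of each signature to each sample.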
DeconstructSigs: delineating mutational processes in single tumors distinguishes DNA repair deficiencies and patterns of carcinoma evolution
Abstract:
BACKGROUND: Analysis of somatic mutations provides insight into the mutational processes that have shaped the cancer genome, but such analysis currently requires large cohorts. We develop deconstructSigs, which allows the identification of mutational signatures within a single tumor sample.
RESULTS: Application of deconstructSigs identifies samples with DNA repair deficiencies and reveals distinct and dynamic mutational processes molding the cancer genome in esophageal adenocarcinoma compared to squamous cell carcinomas.
CONCLUSIONS: deconstructSigs confers the ability to define mutational processes driven by environmental exposures, DNA repair abnormalities, and mutagenic processes in individual tumors with implications for precision cancer medicine.