Papers from the journal Nature Communications.
36 paper shares found in total; this page shows items 21 - 36.
21.
小W
(2023-01-31 23:13):
#paper doi:https://doi.org/10.1038/s41467-022-35320-3
Tumor fractions deciphered from circulating cell-free DNA methylation for cancer early diagnosis. Nat Commun 13, 7694 (2022)
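This paper, from a Tsinghua University team, uses cfDNA methylation signatures to build the SRFD-Bayes diagnostic model. It deconvolves mixed methylation signatures to estimate the tissue of origin (TOO) of tumor-derived cfDNA, which is used to predict the location of the primary tumor and to diagnose cancer early. The work has three parts: simulating cfDNA data from tumor and normal-sample methylation data; selecting methylation markers and building the SRFD-Bayes model on a reference database learned from plasma cfDNA methylation profiles by semi-reference-free deconvolution (SRFD); and validation. When validated on early-stage patients and healthy individuals, the model reached a sensitivity of 86.1% for early cancer detection and an average accuracy of 76.9% for tumor localization, at a specificity of 94.7%.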
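To make the idea of a "tumor fraction" concrete, here is a minimal sketch of a two-component mixture estimate. It is not the paper's SRFD method: it assumes tumor and normal reference methylation levels are already known, and all values and names below are hypothetical.

```python
def estimate_tumor_fraction(cfdna, tumor_ref, normal_ref):
    """Least-squares estimate of f in: cfdna ~ f * tumor_ref + (1 - f) * normal_ref.

    Each argument is a list of methylation levels (0..1) at the same marker sites.
    The result is clipped to the valid range [0, 1].
    """
    numerator = 0.0
    denominator = 0.0
    for m, t, n in zip(cfdna, tumor_ref, normal_ref):
        diff = t - n  # how much this marker separates tumor from normal
        numerator += (m - n) * diff
        denominator += diff * diff
    if denominator == 0.0:
        raise ValueError("tumor and normal references are identical at all markers")
    return min(1.0, max(0.0, numerator / denominator))


# Hypothetical methylation levels at five marker CpG sites (not real data).
tumor_ref = [0.90, 0.80, 0.10, 0.85, 0.20]
normal_ref = [0.10, 0.20, 0.70, 0.15, 0.80]

# Simulate a plasma sample in which 5% of the cfDNA comes from the tumor.
true_fraction = 0.05
cfdna = [true_fraction * t + (1 - true_fraction) * n
         for t, n in zip(tumor_ref, normal_ref)]

estimated = estimate_tumor_fraction(cfdna, tumor_ref, normal_ref)
print(f"estimated tumor fraction: {estimated:.3f}")
```

In a real setting the references themselves have to be learned and more than two components (tissues) are mixed, which is where a deconvolution method such as SRFD comes in.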
Abstract:
Tumor-derived circulating cell-free DNA (cfDNA) provides critical clues for cancer early diagnosis, yet it often suffers from low sensitivity. Here, we present a cancer early diagnosis approach using tumor fractions deciphered from circulating cfDNA methylation signatures. We show that the estimated fractions of tumor-derived cfDNA from cancer patients increase significantly as cancer progresses in two independent datasets. Employing the predicted tumor fractions, we establish a Bayesian diagnostic model in which training samples are only derived from late-stage patients and healthy individuals. When validated on early-stage patients and healthy individuals, this model exhibits a sensitivity of 86.1% for cancer early detection and an average accuracy of 76.9% for tumor localization at a specificity of 94.7%. By highlighting the potential of tumor fractions on cancer early diagnosis, our approach can be further applied to cancer screening and tumor progression monitoring.
22.
小擎子
(2023-01-31 23:12):
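#paper doi:10.1038/s41467-022-35237-x Nat Commun., 2022, Sourcing thermotolerant poly(ethylene terephthalate) hydrolase scaffolds from natural diversity. The study combines HMM-based searches with machine learning to identify PET hydrolases and to predict an enzyme's optimal activity temperature from its sequence. Sequences and the optimal growth temperature (OGT) of their source organisms were collected from public databases, and only sequences with an OGT above 50 °C were kept. For sequences without OGT information, a machine-learning model (ThermoProt), a support vector machine trained on computed amino-acid features, was used; it was trained to distinguish 8,000 proteins from thermophiles (above 50 °C) from 8,000 proteins from non-thermophiles (below 50 °C). ThermoProt reached 86.6% accuracy. 74 putative thermotolerant PET hydrolases were selected for experimental screening. The screen yielded 23 thermostable enzymes, none previously reported, whose PET hydrolase activity exceeds that of 36 previously reported enzymes.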
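As an illustration of what "computed amino-acid features" can look like, the sketch below turns a protein sequence into a 20-dimensional amino-acid composition vector, the kind of input a support vector machine could be trained on. Whether ThermoProt uses exactly this feature set is an assumption; the sequence is a toy string, not a real enzyme.

```python
from collections import Counter

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, fixed order


def aa_composition(sequence):
    """Return the fraction of each standard amino acid in a protein sequence.

    Non-standard characters (e.g. X, *) are ignored.
    """
    counts = Counter(ch for ch in sequence.upper() if ch in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Toy sequence for illustration only.
features = aa_composition("MKKLLEEVARK")
for aa, frac in zip(AMINO_ACIDS, features):
    if frac > 0:
        print(aa, round(frac, 3))
```

Each protein becomes one fixed-length row, so sequences of different lengths can be fed to the same classifier.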
Abstract:
Enzymatic deconstruction of poly(ethylene terephthalate) (PET) is under intense investigation, given the ability of hydrolase enzymes to depolymerize PET to its constituent monomers near the polymer glass transition temperature. To date, reported PET hydrolases have been sourced from a relatively narrow sequence space. Here, we identify additional PET-active biocatalysts from natural diversity by using bioinformatics and machine learning to mine 74 putative thermotolerant PET hydrolases. We successfully express, purify, and assay 51 enzymes from seven distinct phylogenetic groups; observing PET hydrolysis activity on amorphous PET film from 37 enzymes in reactions spanning pH from 4.5-9.0 and temperatures from 30-70 °C. We conduct PET hydrolysis time-course reactions with the best-performing enzymes, where we observe differences in substrate selectivity as function of PET morphology. We employed X-ray crystallography and AlphaFold to examine the enzyme architectures of all 74 candidates, revealing protein folds and accessory domains not previously associated with PET deconstruction. Overall, this study expands the number and diversity of thermotolerant scaffolds for enzymatic PET deconstruction.
23.
李翛然
(2023-01-30 16:03):
#paper doi:https://doi.org/10.1038/s41467-022-35343-w Machine learning models to accelerate the design of polymeric long-acting injectables
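The first computational biology paper of 2023 to catch my attention. It happens to touch on one of our recent research directions, which is nice: it suggests our company is on the right track. The paper presents a computational method for designing long-acting drug formulations. Judging from the content, the computational tools and ideas are ones AI practitioners could readily come up with: use AI to learn the characteristics of long-acting drugs and then predict the release efficiency of new drug structures. Still, the conclusions match exactly the direction our company has been considering. Many drugs in human history reached the market in a rough-and-ready state, and there is a great deal of room for improvement. Here's to 2023.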
Abstract:
Long-acting injectables are considered one of the most promising therapeutic strategies for the treatment of chronic diseases as they can afford improved therapeutic efficacy, safety, and patient compliance. The use of polymer materials in such a drug formulation strategy can offer unparalleled diversity owing to the ability to synthesize materials with a wide range of properties. However, the interplay between multiple parameters, including the physicochemical properties of the drug and polymer, make it very difficult to intuitively predict the performance of these systems. This necessitates the development and characterization of a wide array of formulation candidates through extensive and time-consuming in vitro experimentation. Machine learning is enabling leap-step advances in a number of fields including drug discovery and materials science. The current study takes a critical step towards data-driven drug formulation development with an emphasis on long-acting injectables. Here we show that machine learning algorithms can be used to predict experimental drug release from these advanced drug delivery systems. We also demonstrate that these trained models can be used to guide the design of new long acting injectables. The implementation of the described data-driven approach has the potential to reduce the time and cost associated with drug formulation development.
24.
大勇
(2022-12-31 23:13):
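#paper Aversive memory formation in humans involves an amygdala-hippocampus phase code, 2022, Nature Communications, https://doi.org/10.1038/s41467-022-33828-2 We generally form deeper memories of emotional events, a mechanism thought to arise from amygdala modulation of hippocampal activity. However, how the two regions interact, and through what neural dynamics they influence memory, has been unclear. Using intracranial recordings, the authors found that successfully encoded emotional memories are accompanied by coupling of hippocampal gamma oscillations and neuronal firing to the amygdala theta phase. The phase difference between subsequently remembered and not-remembered emotional stimuli translates into a time period that enables lagged coherence between the amygdala and downstream hippocampal gamma. These results reveal a mechanism in which amygdala theta phase coordinates transient amygdala-hippocampal gamma coherence to facilitate aversive memory encoding. The amygdala may relay emotional content to other brain regions in this way and thereby modulate other cognitive functions.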
Abstract:
Memory for aversive events is central to survival but can become maladaptive in psychiatric disorders. Memory enhancement for emotional events is thought to depend on amygdala modulation of hippocampal activity. However, the neural dynamics of amygdala-hippocampal communication during emotional memory encoding remain unknown. Using simultaneous intracranial recordings from both structures in human patients, here we show that successful emotional memory encoding depends on the amygdala theta phase to which hippocampal gamma activity and neuronal firing couple. The phase difference between subsequently remembered vs. not-remembered emotional stimuli translates to a time period that enables lagged coherence between amygdala and downstream hippocampal gamma. These results reveal a mechanism whereby amygdala theta phase coordinates transient amygdala-hippocampal gamma coherence to facilitate aversive memory encoding. Pacing of lagged gamma coherence via amygdala theta phase may represent a general mechanism through which the amygdala relays emotional content to distant brain regions to modulate other aspects of cognition, such as attention and decision-making.
25.
Vincent
(2022-11-30 19:09):
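#paper https://doi.org/10.1038/s41467-020-15298-6 Nature Communications, 2020, Integrative differential expression and gene set enrichment analysis using summary statistics for scRNA-seq studies. Differential expression (DE) analysis and gene set enrichment (GSE) analysis are the two most common analyses in the single-cell field, but they are usually run independently. Because single-cell data are noisy, analyzing them separately reduces statistical power and leads to inconsistent results across datasets (or across methods applied to the same data). Yet the two analyses are tightly linked: DE results are the basis for GSE analysis, and GSE analysis can in turn inform DE analysis (genes are not independent; if one gene is differentially expressed, related genes may be too). Combining the two in a joint analysis can therefore raise statistical power and make results more robust and reproducible. This paper proposes a new method, iDEA, which uses a hierarchical Bayesian model to integrate DE and GSE analysis. In simulations and real-data analyses, the method shows higher statistical power, more consistent DE results, and more accurate GSE conclusions than existing DE or GSE methods.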
Abstract:
Differential expression (DE) analysis and gene set enrichment (GSE) analysis are commonly applied in single cell RNA sequencing (scRNA-seq) studies. Here, we develop an integrative and scalable computational method, iDEA, to perform joint DE and GSE analysis through a hierarchical Bayesian framework. By integrating DE and GSE analyses, iDEA can improve the power and consistency of DE analysis and the accuracy of GSE analysis. Importantly, iDEA uses only DE summary statistics as input, enabling effective data modeling through complementing and pairing with various existing DE methods. We illustrate the benefits of iDEA with extensive simulations. We also apply iDEA to analyze three scRNA-seq data sets, where iDEA achieves up to five-fold power gain over existing GSE methods and up to 64% power gain over existing DE methods. The power gain brought by iDEA allows us to identify many pathways that would not be identified by existing approaches in these data.
26.
小年
(2022-11-30 18:38):
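https://doi.org/10.1038/s41467-021-21865-2 Nature Communications, 2021, Single cell transcriptomic analysis of murine lung development on hyperoxia-induced damage. The study builds a hyperoxia-injured mouse lung model to mimic bronchopulmonary dysplasia and uses single-cell transcriptome sequencing to assess the single-cell developmental dynamics of the mouse lung. Across 3 time points, more than 66,000 single cells were captured from the lungs of 36 mice. The paper describes how cell numbers and gene expression change over developmental time in injured lungs for subpopulations including alveolar epithelium, stromal fibroblasts, capillary endothelium, and macrophages. Pathway analysis and predicted dynamic cellular crosstalk suggest inflammatory signaling as the main driver of hyperoxia-induced changes. The paper offers a fairly broad atlas of cell composition during development in healthy and lung-injured mice, but the cell types covered are limited; finer annotation of subpopulations will depend on better subpopulation markers and a better understanding of cell morphology.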
Abstract:
During late lung development, alveolar and microvascular development is finalized to enable sufficient gas exchange. Impaired late lung development manifests as bronchopulmonary dysplasia (BPD) in preterm infants. Single-cell RNA sequencing (scRNA-seq) allows for assessment of complex cellular dynamics during biological processes, such as development. Here, we use MULTI-seq to generate scRNA-seq profiles of over 66,000 cells from 36 mice during normal or impaired lung development secondary to hyperoxia with validation of some of the findings in lungs from BPD patients. We observe dynamic populations of cells, including several rare cell types and putative progenitors. Hyperoxia exposure, which mimics the BPD phenotype, alters the composition of all cellular compartments, particularly alveolar epithelium, stromal fibroblasts, capillary endothelium and macrophage populations. Pathway analysis and predicted dynamic cellular crosstalk suggest inflammatory signaling as the main driver of hyperoxia-induced changes. Our data provides a single-cell view of cellular changes associated with late lung development in health and disease.
27.
李翛然
(2022-10-31 09:48):
#paper TET1 is a beige adipocyte-selective epigenetic suppressor of thermogenesis doi: https://doi.org/10.1038/s41467-020-18054-y
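Regarding TET1, the literature's description of it as a tumor suppressor seems sound. For using it as a recombinant protein to treat tumors, my next step is to survey which tumor types show TET1 loss in clinical patients, in order to judge whether tumors grow poorly when TET1 is intact and so establish its clinical value. Another consideration is whether two phenomena described in these 2 papers, TET1 suppressing thermogenic energy metabolism in adipocytes and vitamin C acting on TET1 to suppress somatic cell reprogramming, could cause serious side effects and limit the TET1 dose.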
Abstract:
It has been suggested that beige fat thermogenesis is tightly controlled by epigenetic regulators that sense environmental cues such as temperature. Here, we report that subcutaneous adipose expression of the DNA demethylase TET1 is suppressed by cold and other stimulators of beige adipocyte thermogenesis. TET1 acts as an autonomous repressor of key thermogenic genes, including Ucp1 and Ppargc1a, in beige adipocytes. Adipose-selective Tet1 knockout mice generated by using Fabp4-Cre improves cold tolerance and increases energy expenditure and protects against diet-induced obesity and insulin resistance. Moreover, the suppressive role of TET1 in the thermogenic gene regulation of beige adipocytes is largely DNA demethylase-independent. Rather, TET1 coordinates with HDAC1 to mediate the epigenetic changes to suppress thermogenic gene transcription. Taken together, TET1 is a potent beige-selective epigenetic breaker of the thermogenic gene program. Our findings may lead to a therapeutic strategy to increase energy expenditure in obesity and related metabolic disorders.
28.
洪媛媛
(2022-10-30 12:16):
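#paper https://doi.org/10.1038/s41467-022-32995-6 Nature Communications 2022. Cost-effective methylome sequencing of cell-free DNA for accurately detecting and locating cancer. This paper introduces an NGS library-preparation method that enriches cfDNA CpG regions (cfMethyl-Seq). cfMethyl-Seq needs less sequencing data than whole-genome methylation sequencing and is better suited than conventional RRBS to enriching CpG regions in cfDNA. The study first selects markers for early cancer screening and tissue-of-origin (TOO) assignment using RRBS data from tumor and tumor-adjacent tissue samples together with cfMethyl-Seq data from healthy plasma samples. It then splits cfMethyl-Seq data from 217 cancer and 131 healthy plasma samples into training and test sets, builds the model on the training set, and evaluates performance on the test set.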
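The "build on the training set, evaluate on the test set" step is usually reported as sensitivity at a fixed specificity. The sketch below shows one common way to do that: pick the score cutoff from healthy training samples, then apply it unchanged to held-out samples. The scores are made up and the helper names are hypothetical; this is not the paper's classifier.

```python
import math


def threshold_for_specificity(healthy_scores, target_specificity):
    """Smallest cutoff such that at least target_specificity of the
    healthy scores fall at or below it (i.e. are called negative)."""
    ordered = sorted(healthy_scores)
    # Number of healthy samples that must be called negative.
    k = math.ceil(round(target_specificity * len(ordered), 9))
    k = min(max(k, 1), len(ordered))
    return ordered[k - 1]


def sensitivity(cancer_scores, threshold):
    """Fraction of cancer samples scoring strictly above the cutoff."""
    return sum(s > threshold for s in cancer_scores) / len(cancer_scores)


def specificity(healthy_scores, threshold):
    """Fraction of healthy samples scoring at or below the cutoff."""
    return sum(s <= threshold for s in healthy_scores) / len(healthy_scores)


# Made-up classifier scores (higher = more cancer-like).
train_healthy = [0.05, 0.10, 0.12, 0.20, 0.22, 0.25, 0.30, 0.31, 0.40, 0.55]
test_healthy = [0.08, 0.15, 0.33, 0.45]
test_cancer = [0.35, 0.60, 0.80, 0.42, 0.38]

cutoff = threshold_for_specificity(train_healthy, 0.90)
print("cutoff:", cutoff)
print("test sensitivity:", sensitivity(test_cancer, cutoff))
print("test specificity:", specificity(test_healthy, cutoff))
```

Fixing the cutoff on training data first is what keeps the test-set numbers an honest estimate; the specificity observed on the test set can differ from the training target.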
Abstract:
Early cancer detection by cell-free DNA faces multiple challenges: low fraction of tumor cell-free DNA, molecular heterogeneity of cancer, and sample sizes that are not sufficient to reflect diverse patient populations. Here, we develop a cancer detection approach to address these challenges. It consists of an assay, cfMethyl-Seq, for cost-effective sequencing of the cell-free DNA methylome (with > 12-fold enrichment over whole genome bisulfite sequencing in CpG islands), and a computational method to extract methylation information and diagnose patients. Applying our approach to 408 colon, liver, lung, and stomach cancer patients and controls, at 97.9% specificity we achieve 80.7% and 74.5% sensitivity in detecting all-stage and early-stage cancer, and 89.1% and 85.0% accuracy for locating tissue-of-origin of all-stage and early-stage cancer, respectively. Our approach cost-effectively retains methylome profiles of cancer abnormalities, allowing us to learn new features and expand to other cancer types as training cohorts grow.
29.
笑对人生
(2022-10-07 22:00):
#paper doi: 10.1038/s41467-022-30033-z. Reference-free cell type deconvolution of multi-cellular pixel-resolution spatially resolved transcriptomics data. Nat Commun. 2022 Apr 29;13(1):2339.
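Spatial transcriptomics can reveal the transcriptional profiles of cells in different regions of a tissue, which matters for understanding the tissue's cell biology. Current technologies have limitations, however. First, sequencing-based spatial transcriptomics has low resolution and does not reach true single-cell level. Second, spatial transcriptomics based on in situ hybridization or microscopy imaging detects a limited number of RNAs and is expensive.
To address these problems, researchers have developed a series of algorithms that integrate single-cell and spatial transcriptomic data to predict cell types at multi-cellular pixel resolution and to recover the full transcriptional profile of individual cells. SPOTlight uses a matrix of cell-type marker genes from single-cell RNA-seq (scRNA-seq) data and applies seeded non-negative matrix factorization to deconvolve the cell types at each capture location (spot). RCTD takes as reference input the mean expression of all marker genes for each cell type in the scRNA-seq data, builds a probabilistic model reflecting each cell type's contribution within a spot, and then predicts the cell types and their proportions. SpatialDWLS first runs GSEA enrichment with cell-type signature genes from scRNA-seq and then infers the cell-type composition of each spot with dampened weighted least squares. All of these methods depend on suitable scRNA-seq data and are strongly affected by cost, technology, and biological variation. Although many atlases of healthy human organs and tissues have been published, batch effects and heterogeneity may still be a problem. In addition, droplet-based scRNA-seq requires tissue dissociation and cell capture, which may make the cell types identified by scRNA-seq inconsistent with those in the spatial data. For these reasons, a spot-level cell-type deconvolution method that needs no reference data is required.
STdeconvolve is a software package that deconvolves cell types in spatial transcriptomics data without a single-cell reference. Its core algorithm is latent Dirichlet allocation (LDA), a statistical model widely used in natural language processing to discover latent topics in a collection of documents, with the result output as probability distributions. Applied to spatial transcriptomics, LDA takes the gene expression count matrix at multi-cellular pixel resolution as input and infers the transcriptional profile of each cell type (topic) and the proportion of each cell type. On both simulated ST data and spatial transcriptomics data of different resolutions (10X Visium, DBiT-seq, and Slide-seq), STdeconvolve effectively recovers the transcriptional profile of each cell type in the tissue and the proportion of each cell type at the original resolution. When a matched single-cell reference dataset exists, STdeconvolve performs comparably to reference-based tools; when no matched dataset is available, STdeconvolve performs better. The performance metric in the paper is root mean square error (RMSE), which measures the size of the error in a model's predictions; in general, the smaller the RMSE, the better the model predicts.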
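For readers unfamiliar with the metric, RMSE between true and predicted cell-type proportions in one spot can be computed as in this small sketch. The proportions are invented for illustration and the function name is my own.

```python
import math


def rmse(true_values, predicted_values):
    """Root mean square error between two equal-length lists of numbers."""
    if len(true_values) != len(predicted_values):
        raise ValueError("inputs must have the same length")
    if not true_values:
        raise ValueError("inputs must not be empty")
    squared_errors = [(t - p) ** 2 for t, p in zip(true_values, predicted_values)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Invented proportions of three cell types within a single spot.
true_props = [0.5, 0.3, 0.2]
pred_props = [0.4, 0.4, 0.2]

print(f"RMSE = {rmse(true_props, pred_props):.4f}")
```

A perfect prediction gives an RMSE of 0, and because errors are squared before averaging, one badly mispredicted cell type weighs more than several small misses.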
Abstract:
Recent technological advancements have enabled spatially resolved transcriptomic profiling but at multi-cellular pixel resolution, thereby hindering the identification of cell-type-specific spatial patterns and gene expression variation. To address this challenge, we develop STdeconvolve as a reference-free approach to deconvolve underlying cell types comprising such multi-cellular pixel resolution spatial transcriptomics (ST) datasets. Using simulated as well as real ST datasets from diverse spatial transcriptomics technologies comprising a variety of spatial resolutions such as Spatial Transcriptomics, 10X Visium, DBiT-seq, and Slide-seq, we show that STdeconvolve can effectively recover cell-type transcriptional profiles and their proportional representation within pixels without reliance on external single-cell transcriptomics references. STdeconvolve provides comparable performance to existing reference-based methods when suitable single-cell references are available, as well as potentially superior performance when suitable single-cell references are not available. STdeconvolve is available as an open-source R software package with the source code available at https://github.com/JEFworks-Lab/STdeconvolve .
30.
小年
(2022-08-31 10:19):
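#paper doi.org/10.1038/s41467-021-24213-6 Nature Communications, 2021, Single-cell transcriptomic analysis reveals disparate effector differentiation pathways in human Treg compartment. Human regulatory T cells (Treg) are a highly immunosuppressive population of CD4+ T cells. The authors sorted Treg cells from bone marrow and peripheral blood of healthy donors and used single-cell transcriptomics and single-cell TCR sequencing to build a Treg atlas of two tissues in healthy individuals. Trajectory analysis resolved two main differentiation paths with distinct functions, supported by flow-cytometry sorting. The same techniques were then applied to Treg cells from transplant patients with and without aGVHD, showing that the two differentiation paths and the corresponding cell populations are conserved in transplant patients, although aGVHD patients show some functional and migratory impairments. These findings broaden the understanding of Treg heterogeneity and differentiation and provide a single-cell atlas for dissecting Treg complexity in health and disease.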
Abstract:
Human FOXP3+ regulatory T (Treg) cells are central to immune tolerance. However, their heterogeneity and differentiation remain incompletely understood. Here we use single-cell RNA and T cell receptor sequencing to resolve Treg cells from healthy individuals and patients with or without acute graft-versus-host disease (aGVHD) who undergo stem cell transplantation. These analyses, combined with functional assays, separate Treg cells into naïve, activated, and effector stages, and resolve the HLA-DR, LIMS1, highly suppressive FOXP3, and highly proliferative MKI67 effector subsets. Trajectory analysis assembles Treg subsets into two differentiation paths (I/II) with distinctive phenotypic and functional programs, ending with the FOXP3 and MKI67 subsets, respectively. Transcription factors FOXP3 and SUB1 contribute to some Path I and Path II phenotypes, respectively. These FOXP3 and MKI67 subsets and two differentiation pathways are conserved in transplanted patients, despite having functional and migratory impairments under aGVHD. These findings expand the understanding of Treg cell heterogeneity and differentiation and provide a single-cell atlas for the dissection of Treg complexity in health and disease.
31.
笑对人生
(2022-07-31 09:15):
#paper Phasing analysis of lung cancer genomes using a long read sequencer. Nat Commun. 2022 Jun 16;13(1):3464. doi: 10.1038/s41467-022-31133-6
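Background: An SNV (single nucleotide variant) is a site in the genome where a single base has changed. An SNP (single nucleotide polymorphism) is a DNA sequence polymorphism in the genome caused by variation at a single nucleotide. SNV describes a base change in an individual's genome, whereas SNP is more of a population-level property. A plainer definition in English: A haplotype is a physical grouping of genomic variants (or polymorphisms) that tend to be inherited together. A specific haplotype typically reflects a unique combination of variants that reside near each other on a chromosome. In other words, a haplotype is the combination of associated allelic variant sites within a region of one chromosome, and each combination represents one haplotype. Typing haplotypes, that is, determining whether variants come from the same chromosome, is called phasing (also haplotype estimation). The typing or variation referred to here is usually the result of a comparison; in population genetics this may compare an individual with the rest of the population, or offspring with parents, and it describes the outcome of evolution or variation (my own understanding). An allele (also allelomorph) generally refers to one of a pair of genes at the same position on a pair of homologous chromosomes (one paternal, one maternal) that control different states of the same trait. English definitions: Any one of a series of two or more different genes that may occupy the same locus on a specific chromosome; An allele is a variant form of a gene. With current second- or third-generation sequencing, each read comes from a single chromosome copy, yet a read by itself does not reveal whether the sequence is of paternal or maternal origin. Compared with second-generation sequencing, however, third-generation sequencing has long reads that span most neighboring SNP sites, allowing more accurate haplotype phasing. Third-generation sequencing can also accurately detect copy number variants (CNVs) and carries base-modification information such as methylation while the sequence is being phased.
Aim: Detection of SNVs, indels, and CNVs in tumors has mostly relied on second-generation sequencing. Because of its short reads, second-generation sequencing cannot accurately resolve high-GC regions, repetitive regions, or large chromosomal variants. Exploiting the very long reads of third-generation sequencing should therefore reveal the variation events in tumors more completely. The recently released ONT Q20+ chemistry achieves >99% raw-read (single-strand) accuracy, or about Q30 duplex accuracy. This study uses ONT nanopore sequencing to investigate phasing, copy number variation, and chromothripsis in non-small cell lung cancer at the tissue and cell level.
Methods: Normal tissue from 20 non-small cell lung cancer patients underwent both second- and third-generation whole-genome sequencing, while tumor tissue underwent third-generation sequencing only. Methylation signals from the sequencing data and ONT full-length transcriptome sequencing were used to explore the relationship between variants and phenotype. To further probe the clonal structure of tumor cells, ONT-based scDNA-seq was performed on 2 samples.
Results: By correcting the third-generation data with second-generation data, the study obtained phased blocks with an N50 length of 834 kb, in which SNV calls were close to 99% concordant with public second-generation WGS databases. Combining methylation data with full-length transcriptome sequencing, variation in phased blocks (haplotype variation) was found to correlate with methylation and transcriptional regulation in only two samples. Analysis of large chromosomal variants showed that EGFR mutation-positive lung adenocarcinomas harbor characteristic chromothripsis-like events, suggesting that aberrant EGFR signaling may affect telomerase activity.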
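Since the result is quoted as an N50 of phased blocks, here is a short sketch of how N50 is computed from a list of block lengths. The lengths are invented, and the function is a generic illustration rather than the tool used in the paper.

```python
def n50(lengths):
    """Return the N50 of a list of block lengths.

    N50 is the length L such that blocks of length >= L together
    cover at least half of the total length of all blocks.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length


# Invented phased-block lengths in kb.
block_lengths_kb = [900, 800, 300, 200, 100]
print("N50 (kb):", n50(block_lengths_kb))
```

A higher N50 means that most of the phased genome sits in long blocks, which is why it is a common summary of phasing contiguity.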
Abstract:
Chromosomal backgrounds of cancerous mutations still remain elusive. Here, we conduct the phasing analysis of non-small cell lung cancer specimens of 20 Japanese patients. By the combinatory use of short and long read sequencing data, we obtain long phased blocks of 834 kb in N50 length with >99% concordance rate. By analyzing the obtained phasing information, we reveal that several cancer genomes harbor regions in which mutations are unevenly distributed to either of two haplotypes. Large-scale chromosomal rearrangement events, which resemble chromothripsis events but have smaller scales, occur on only one chromosome, and these events account for the observed biased distributions. Interestingly, the events are characteristic of EGFR mutation-positive lung adenocarcinomas. Further integration of long read epigenomic and transcriptomic data reveal that haploid chromosomes are not always at equivalent transcriptomic/epigenomic conditions. Distinct chromosomal backgrounds are responsible for later cancerous aberrations in a haplotype-specific manner.
32.
洪媛媛
(2022-07-29 14:23):
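#paper https://doi.org/10.1038/s41467-022-31765-8 Nat Commun 13, 4248 (2022). Accurate somatic variant detection using weakly supervised deep learning. Somatic mutation calling in tumors usually relies on statistical methods combined with filters. This paper uses a deep learning method named "VarNet" to call somatic mutations from paired tumor and normal DNA data. VarNet is trained on tumor DNA and matched normal DNA sequences with known mutated and non-mutated labels: for each site, the base, base quality, mapping quality, strand bias, and reference base are assembled into a multi-dimensional matrix, and the model predicts the probability of a mutation at each position. VarNet is then compared with 4 other published methods on 4 publicly available benchmark datasets in terms of precision and recall of mutation calling, showing that VarNet outperforms the 4 existing methods.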
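The precision and recall used in such benchmarks can be sketched as a set comparison between called variants and a truth set. The variants below are invented, and representing a variant as a (chromosome, position, ref, alt) tuple is an assumption made for the example.

```python
def precision_recall(called, truth):
    """Compare called variants against a truth set.

    precision = true positives / all calls
    recall    = true positives / all true variants
    """
    called, truth = set(called), set(truth)
    true_positives = len(called & truth)
    precision = true_positives / len(called) if called else 0.0
    recall = true_positives / len(truth) if truth else 0.0
    return precision, recall


# Invented variants as (chromosome, position, ref, alt).
truth_set = {
    ("chr1", 100, "A", "T"),
    ("chr1", 250, "G", "C"),
    ("chr2", 75, "C", "T"),
    ("chr3", 10, "T", "G"),
}
called_set = {
    ("chr1", 100, "A", "T"),
    ("chr1", 250, "G", "C"),
    ("chr2", 75, "C", "T"),
    ("chr2", 900, "G", "A"),  # false positive
    ("chr4", 5, "A", "C"),    # false positive
}

precision, recall = precision_recall(called_set, truth_set)
print(f"precision={precision:.2f} recall={recall:.2f}")
```

A caller can trade one for the other by moving its probability cutoff, so benchmarks usually report both.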
Abstract:
Identification of somatic mutations in tumor samples is commonly based on statistical methods in combination with heuristic filters. Here we develop VarNet, an end-to-end deep learning approach for identification of somatic variants from aligned tumor and matched normal DNA reads. VarNet is trained using image representations of 4.6 million high-confidence somatic variants annotated in 356 tumor whole genomes. We benchmark VarNet across a range of publicly available datasets, demonstrating performance often exceeding current state-of-the-art methods. Overall, our results demonstrate how a scalable deep learning approach could augment and potentially supplant human engineered features and heuristic filters in somatic variant calling.
33.
沈么是快乐星球
(2022-07-29 08:53):
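#paper doi:10.1038/s41467-020-19681-1 Nature Communications, 2020, Genome-enabled discovery of anthraquinone biosynthesis in Senna tora. Senna tora is a Chinese medicinal herb whose main active compounds are its abundant anthraquinones, found mainly in the seeds. Through whole-genome sequencing and comparative genomics, the paper finds that the CHS gene family has undergone a lineage-specific, rapid expansion in S. tora, concentrated on chromosome 7. Metabolite and transcriptome profiling of seeds at different developmental stages narrowed the search to 3 candidate genes; one candidate was chosen based on expression pattern, evolutionary relationships, and gene structure, and a more distantly related member of the CHS gene family was chosen as a negative control. The candidate was finally validated by in vitro enzyme assays (candidate protein, inactivated candidate protein, and negative-control protein; only the candidate protein converted the substrate to the downstream product). The approach is simple and clear. Candidate screening started from gene clusters whose expression patterns resembled the metabolite accumulation patterns, and a "metabolite library" was built to analyze the main enriched metabolic pathways. I have not yet studied the enzyme-assay part in detail, as it involves a lot of metabolism knowledge.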
Abstract:
Senna tora is a widely used medicinal plant. Its health benefits have been attributed to the large quantity of anthraquinones, but how they are made in plants remains a mystery. To identify the genes responsible for plant anthraquinone biosynthesis, we reveal the genome sequence of S. tora at the chromosome level with 526 Mb (96%) assembled into 13 chromosomes. Comparison among related plant species shows that a chalcone synthase-like (CHS-L) gene family has lineage-specifically and rapidly expanded in S. tora. Combining genomics, transcriptomics, metabolomics, and biochemistry, we identify a CHS-L gene contributing to the biosynthesis of anthraquinones. The S. tora reference genome will accelerate the discovery of biologically active anthraquinone biosynthesis pathways in medicinal plants.
34.
颜林林
(2022-07-04 20:59):
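#paper doi:10.1038/s41467-022-31236-0, Nature Communications, 2022, A convolutional neural network highlights mutations relevant to antimicrobial resistance in Mycobacterium tuberculosis. The paper builds a CNN (convolutional neural network) model on sequencing data from more than 20,000 Mycobacterium tuberculosis isolates. It uses 18 loci chosen from prior knowledge as related to drug resistance and takes the full sequence of each locus as input to predict resistance. The CNN outperforms other current approaches based on traditional machine learning and on conventional non-convolutional neural networks. Because the deep learning method extracts implicit features from the sequence, it can also help predict the effect of unknown mutations on resistance.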
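Feeding "the full sequence of each locus" to a CNN requires turning nucleotides into numbers. One common choice is one-hot encoding, sketched below. Whether the paper encodes its input exactly this way, and how it treats ambiguous bases, is an assumption here; the sequence is a toy example.

```python
BASES = "ACGT"


def one_hot(sequence):
    """Encode a DNA sequence as a list of 4-element rows (A, C, G, T).

    Any other character (e.g. N or a gap) becomes an all-zero row,
    which is an assumption made for this sketch.
    """
    index = {base: i for i, base in enumerate(BASES)}
    encoded = []
    for base in sequence.upper():
        row = [0, 0, 0, 0]
        if base in index:
            row[index[base]] = 1
        encoded.append(row)
    return encoded


# Toy sequence, not a real M. tuberculosis locus.
for base, row in zip("ACGTN", one_hot("ACGTN")):
    print(base, row)
```

The resulting length-by-4 matrix is what a 1D convolution slides over, so each filter can learn short sequence motifs directly.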
IF: 14.700, Q1
Nature communications,
2022-07-02.
DOI: 10.1038/s41467-022-31236-0
PMID: 35780211
PMCID:PMC9250494
Abstract:
Long diagnostic wait times hinder international efforts to address antibiotic resistance in M. tuberculosis. Pathogen whole genome sequencing, coupled with statistical and machine learning models, offers a promising solution. However, generalizability and clinical adoption have been limited by a lack of interpretability, especially in deep learning methods. Here, we present two deep convolutional neural networks that predict antibiotic resistance phenotypes of M. tuberculosis isolates: a multi-drug CNN (MD-CNN), that predicts resistance to 13 antibiotics based on 18 genomic loci, with AUCs 82.6-99.5% and higher sensitivity than state-of-the-art methods; and a set of 13 single-drug CNNs (SD-CNN) with AUCs 80.1-97.1% and higher specificity than the previous state-of-the-art. Using saliency methods to evaluate the contribution of input sequence features to the SD-CNN predictions, we identify 18 sites in the genome not previously associated with resistance. The CNN models permit functional variant discovery, biologically meaningful interpretation, and clinical applicability.
35.
Vincent
(2022-04-30 21:26):
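#paper https://doi.org/10.1038/s41467-020-17678-4 A deep learning model to predict RNA-Seq expression of tumours from whole slide images. Nature Comm (2020) Deep learning models (CNNs) are widely used in medical imaging, and recent studies have shown that DNA mutations and mutation counts can be predicted from pathology images, but no study had examined whether gene expression can be predicted from pathology images; this paper fills that gap. It proposes HE2RNA, a multi-task, weakly supervised deep learning model trained on TCGA data from different cancer types (WSI + RNA-seq). The number of genes that can be predicted accurately depends mainly on the size of the training dataset. Enrichment analysis of the accurately predicted genes shows that they concentrate in pathways for immunity and T-cell regulation, the cell cycle, and cancer hallmarks. Finally, the paper shows that HE2RNA can be used for spatial visualization of gene expression (predicting where genes are expressed on the slide) and for improving MSI prediction.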
IF: 14.700, Q1
Nature communications,
2020-08-03.
DOI: 10.1038/s41467-020-17678-4
PMID: 32747659
PMCID:PMC7400514
Abstract:
Deep learning methods for digital pathology analysis are an effective way to address multiple clinical questions, from diagnosis to prediction of treatment outcomes. These methods have also been used to predict gene mutations from pathology images, but no comprehensive evaluation of their potential for extracting molecular features from histology slides has yet been performed. We show that HE2RNA, a model based on the integration of multiple data modes, can be trained to systematically predict RNA-Seq profiles from whole-slide images alone, without expert annotation. Through its interpretable design, HE2RNA provides virtual spatialization of gene expression, as validated by CD3- and CD20-staining on an independent dataset. The transcriptomic representation learned by HE2RNA can also be transferred on other datasets, even of small size, to increase prediction performance for specific molecular phenotypes. We illustrate the use of this approach in clinical diagnosis purposes such as the identification of tumors with microsatellite instability.
36.
草莓味儿
(2022-03-31 21:36):
#paper DNA methylation loss promotes immune evasion of tumours with high mutation and copy number load
DOI: 10.1038/s41467-019-12159-9
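The authors carried out a large-scale, systematic analysis of TCGA data across tumor types, examining how global methylation levels relate to cell proliferation, mutation burden, SCNA levels, markers of infiltrating immune cells, and the activity of immune-response genes. The results suggest that genomic demethylation, an important predictive marker in immunotherapy, is linked to epigenetic regulation, and that epigenetic modulation could be combined with immunotherapy in precision treatment.
Main content: using cell-line data, the authors identified genes that replicate earlier or later in cancer than in normal cells; methylation loss in late-replicating regions, also called partial methylation domains (PMDs), is associated with repression of immune genes; global methylation predicts response to immunotherapy: hypotheses derived from the molecular analysis were tested in lung cancer cohorts, and notably this is the first study of DNA methylation patterns in molecular and clinical data on cancer immunotherapy; the global demethylation analysis excluded the influence of aneuploidy. Overall, combining epigenetic modulation with checkpoint blockade may be a potential precision immunotherapy strategy.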
IF: 14.700, Q1
Nature communications,
2019-09-19.
DOI: 10.1038/s41467-019-12159-9
PMID: 31537801
PMCID:PMC6753140
Abstract:
Mitotic cell division increases tumour mutation burden and copy number load, predictive markers of the clinical benefit of immunotherapy. Cell division correlates also with genomic demethylation involving methylation loss in late-replicating partial methylation domains. Here we find that immunomodulatory pathway genes are concentrated in these domains and transcriptionally repressed in demethylated tumours with CpG island promoter hypermethylation. Global methylation loss correlated with immune evasion signatures independently of mutation burden and aneuploidy. Methylome data of our cohort (n = 60) and a published cohort (n = 81) in lung cancer and a melanoma cohort (n = 40) consistently demonstrated that genomic methylation alterations counteract the contribution of high mutation burden and increase immunotherapeutic resistance. Higher predictive power was observed for methylation loss than mutation burden. We also found that genomic hypomethylation correlates with the immune escape signatures of aneuploid tumours. Hence, DNA methylation alterations implicate epigenetic modulation in precision immunotherapy.