Papers shared by user 颜林林.
122 shared papers found in total; this page shows entries 101 to 120.
101.
颜林林 (2022-06-15 06:27):
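#paper doi:10.1186/s12859-022-04783-y BMC Bioinformatics, 2022, CancerNet: a unified deep learning network for pan-cancer diagnostics. This paper builds a general-purpose deep neural network that uses methylation data from 33 TCGA cancer types to detect cancer and identify its tissue of origin. Earlier work on the same task, already published in 2022, reached an overall accuracy of 96%. This paper combines unsupervised and supervised learning: besides its 34 classification outputs (33 cancer types plus 1 normal, non-cancer class), the model also generates a reconstructed set of CpG island methylation values, compares them with the CpG island methylation values supplied as model input, and includes that reconstruction difference in the loss function. This further improves performance, raising overall accuracy to 99.6%. The paper also examines how confounders such as age and metastasis affect the model, and lays groundwork for future research on model interpretability. The whole study is hosted on OSF (Open Science Framework) with fully open data and source code, which makes it good learning material.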
IF: 2.900, Q1. BMC bioinformatics, 2022-Jun-13. DOI: 10.1186/s12859-022-04783-y PMID: 35698059
Abstract:
BACKGROUND: Despite remarkable advances in cancer research, cancer remains one of the leading causes of death worldwide. Early detection of cancer and localization of the tissue of its origin are key to effective treatment. Here, we leverage technological advances in machine learning or artificial intelligence to design a novel framework for cancer diagnostics. Our proposed framework detects cancers and their tissues of origin using a unified model of cancers encompassing 33 cancers represented in The Cancer Genome Atlas (TCGA). Our model exploits the learned features of different cancers reflected in the respective dysregulated epigenomes, which arise early in carcinogenesis and differ remarkably between different cancer types or subtypes, thus holding a great promise in early cancer detection. RESULTS: Our comprehensive assessment of the proposed model on the 33 different tissues of origin demonstrates its ability to detect and classify cancers to a high accuracy (> 99% overall F-measure). Furthermore, our model distinguishes cancers from pre-cancerous lesions to metastatic tumors and discriminates between hypomethylation changes due to age related epigenetic drift and true cancer. CONCLUSIONS: Beyond detection of primary cancers, our proposed computational model also robustly detects tissues of origin of secondary cancers, including metastatic cancers, second primary cancers, and cancers of unknown primaries. Our assessment revealed the ability of this model to characterize pre-cancer samples, a significant step forward in early cancer detection. Deployed broadly this model can deliver accurate diagnosis for a greatly expanded target patient population.
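The joint objective described in the note for entry 101 (a classification term plus a term that penalizes the gap between the input methylation values and the model's reconstruction) can be sketched as below. This is a minimal illustration under stated assumptions, not CancerNet's actual code: the names `softmax` and `combined_loss`, the use of mean squared error for the reconstruction term, and the weight `alpha` are all hypothetical choices made for the example.

```python
import math

def softmax(logits):
    """Softmax over one sample's logits, shifted by the max for stability."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def combined_loss(logits, label, x_input, x_reconstructed, alpha=1.0):
    """Cross-entropy on the class label plus alpha times reconstruction MSE.

    alpha is a hypothetical weight; the note does not give the real weighting.
    """
    cross_entropy = -math.log(softmax(logits)[label])
    mse = sum((a - b) ** 2 for a, b in zip(x_input, x_reconstructed)) / len(x_input)
    return cross_entropy + alpha * mse

# Toy sample: 34 output classes (33 cancer types + 1 normal), 3 CpG features.
logits = [0.0] * 34
methylation = [0.1, 0.9, 0.5]
loss_exact = combined_loss(logits, 3, methylation, methylation)
loss_off = combined_loss(logits, 3, methylation, [0.2, 0.8, 0.5])
print(loss_exact, loss_off)
```

With a perfect reconstruction only the classification term remains; any mismatch between input and reconstruction raises the loss, which is the pressure that pushes the network to keep the input's methylation information in its internal representation.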
102.
颜林林 (2022-06-14 00:32):
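#paper doi:10.1002/humu.24378 Human Mutation, 2022, Short amplicon reverse transcription-polymerase chain reaction detects aberrant splicing in genes with low expression in blood missed by ribonucleic acid sequencing analysis for clinical diagnosis. The title plays strictly by the rules and spells out every abbreviation, which cost me some time to decode: "reverse transcription-polymerase chain reaction" is simply RT-PCR, and "ribonucleic acid sequencing" is RNA-seq. It turns out to be a story about RT-PCR outperforming RNA-seq in certain situations. From another study (the Splicing and Disease Research Study), the authors selected 13 real clinical cases known to carry splicing-related variants, such as exon skipping, in genes that are normally expressed at low levels in blood. RNA was extracted from peripheral blood of these cases and analyzed by both RNA-seq and RT-PCR. Short amplicon RT-PCR did detect these variants effectively and with higher sensitivity, whereas RNA-seq struggled because expression was too low. This validates short amplicon RT-PCR for detecting splicing-related variants in such low-expression genes.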
IF: 3.300, Q2. Human mutation, 2022-07. DOI: 10.1002/humu.24378 PMID: 35476365
Abstract:
Use of blood RNA sequencing (RNA-seq) as a splicing analysis tool for clinical interpretation of variants of uncertain significance (VUSs) found via whole-genome and exome sequencing can be difficult for genes that have low expression in the blood due to insufficient read count coverage aligned to specific genes of interest. Here, we present a short amplicon reverse transcription-polymerase chain reaction (RT-PCR) for the detection of genes with low blood expression. Short amplicon RT-PCR is designed to span three exons where an exon harboring a variant is flanked by one upstream and one downstream exon. We tested short amplicon RT-PCRs for genes that have median transcripts per million (TPM) values less than one according to the genotype-tissue expression database. Median TPM values of genes analyzed in this study are SYN1 = 0.8549, COL1A1 = 0.6275, TCF4 = 0.4009, DSP = 0.2894, TTN = 0.2851, COL5A2 = 0.1036, TERT = 0.04452, NTRK2 = 0.0344, ABCA4 = 0.00744, PRPH = 0, and WT1 = 0. All these genes show insufficient exon-spanning read coverage in our RNA-seq data to allow splicing analysis. We successfully detected all genes tested except PRPH and WT1. Aberrant splicing was detected in SYN1, TCF4, NTRK2, TTN, and TERT VUSs. Therefore, our results show short amplicon RT-PCR is a useful alternative for the analysis of splicing events in genes with low TPM in blood RNA for clinical diagnostics.
103.
颜林林 (2022-06-13 05:47):
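#paper doi:10.1038/s41588-022-01082-3 Nature Genetics, 2022, Genomic analysis defines clonal relationships of ductal carcinoma in situ and recurrent invasive breast cancer. This paper studies ductal carcinoma in situ (DCIS). DCIS is commonly found in breast cancer screening, and even after treatment a small fraction of patients progress and recur with invasive breast cancer. The paper asks whether these recurrences all arise from clones of the primary DCIS or whether some are new, unrelated disease. To answer this, the study included 129 paired DCIS recurrence cases (the initial DCIS sample and the recurrence sample, plus matched adjacent normal tissue as control) and analyzed mutations and copy number alterations using whole-exome sequencing, SNP arrays, or targeted gene panel sequencing (the platforms differ because the samples and experiments came from three different institutions in the Netherlands, the UK, and the US). In addition, for 4 cases, the primary and recurrent tissues were dissociated and subjected to single-cell genome sequencing. Clonal evolution analysis under both strategies confirmed that not all ipsilateral invasive breast cancers are clonally related to the preceding DCIS; about one fifth are in fact new primary cancers. The result confirms earlier findings at larger scale and in finer detail.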
IF: 31.700, Q1. Nature genetics, 2022-06. DOI: 10.1038/s41588-022-01082-3 PMID: 35681052
Abstract:
Ductal carcinoma in situ (DCIS) is the most common form of preinvasive breast cancer and, despite treatment, a small fraction (5-10%) of DCIS patients develop subsequent invasive disease. A fundamental biologic question is whether the invasive disease arises from tumor cells in the initial DCIS or represents new unrelated disease. To address this question, we performed genomic analyses on the initial DCIS lesion and paired invasive recurrent tumors in 95 patients together with single-cell DNA sequencing in a subset of cases. Our data show that in 75% of cases the invasive recurrence was clonally related to the initial DCIS, suggesting that tumor cells were not eliminated during the initial treatment. Surprisingly, however, 18% were clonally unrelated to the DCIS, representing new independent lineages and 7% of cases were ambiguous. This knowledge is essential for accurate risk evaluation of DCIS, treatment de-escalation strategies and the identification of predictive biomarkers.
104.
颜林林 (2022-06-12 07:49):
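#paper doi:10.1038/s41598-022-13336-5 Scientific Reports, 2022, Omics-based integrated analysis identified IKZF2 as a biomarker associated with lupus nephritis. Many people, like me, probably first heard of systemic lupus erythematosus (SLE) from 《第一次亲密接触》 (The First Intimate Contact), a heart-wrenching novel that was hugely popular more than twenty years ago. This paper studies lupus nephritis (LN), a major complication of SLE and a leading cause of death in these patients. The authors collected and mined public kidney tissue datasets, covering tubulointerstitium and glomerulus tissue from LN patients as well as healthy kidney tissue from transplant donors, and identified 26 common differentially expressed genes (co-DEGs). They then focused on IKZF2 and applied standard bioinformatics methods (functional enrichment, protein-protein interaction networks, ceRNA network construction, immune infiltration, and risk assessment) to establish the predictive and evaluative value of IKZF2 in LN. The methods themselves offer few highlights and follow a familiar recipe; the novelty and research value come mainly from the choice of clinical question.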
IF: 3.800, Q1. Scientific reports, 2022-06-10. DOI: 10.1038/s41598-022-13336-5 PMID: 35688845
Abstract:
Lupus nephritis (LN) is a crucial complication of systemic lupus erythematosus (SLE). IKZF2 was identified as a lupus susceptibility locus, while its exact molecular function in LN is unknown. We aimed to explore the relationship between IKZF2 and LN based on multi-omics data. In our study, we carried out a meta-analysis of publicly available data, including not only tubulointerstitium, but also glomerulus tissue samples from LN patients and controls. Based on the common differentially expressed genes (co-DEGs) and previous researches, we selected IKZF2 for further analysis. Then, we analyzed potential molecular mechanisms of co-DEGs and IKZF2 in LN. To explore the possible targets of IKZF2, protein-protein interaction (PPI) network and ceRNA network of IKZF2 were also constructed. Moreover, we performed immune infiltration analysis and evaluated clinical value of IKZF2. A total of 26 co-DEGs were observed in the integration of the above DEGs coming from the four sets of data, of which IKZF2 was selected for further analysis. Functional enrichment analysis from IKZF2 and related PPI network confirmed the tight relationship between IKZF2 and the immune reaction. Moreover, immune infiltration analysis revealed the significant correlation between IKZF2 and naïve B cell, NK cell activation, NK cell rest and other immune cells. Receiver operating characteristic (ROC) analysis showed that the areas under the ROC curves were 0.721, 0.80, 0.682, and 0.859 for IKZF2 in four datasets, which demonstrated the clinical value of IKZF2. Our study revealed that IKZF2 may play an essential role in the molecular function and development of LN, and might be a potential biomarker for distinguishing LN patients and healthy ones.
105.
颜林林 (2022-06-11 14:48):
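#paper doi:10.1016/j.cell.2022.04.016 Cell, 2022, Structural basis for RNA surveillance by the human nuclear exosome targeting (NEXT) complex. This paper, in the latest issue of Cell, comes from MSKCC (Memorial Sloan Kettering Cancer Center) and has only two authors, M. Rhyan Puno and Christopher D. Lima. The study uses cryo-electron microscopy (cryo-EM) to determine the molecular structure of the human nuclear exosome targeting (NEXT) complex. The exosome in the title is a protein complex containing several exoribonucleases; it degrades RNA in the cell and is an important mechanism governing RNA lifetime and cellular homeostasis. A different concept common in liquid biopsy, the secreted vesicle, is also called "exosome" in English, but that is a vesicle bounded by a phospholipid bilayer membrane that packages proteins, nucleic acids, and other molecules for secretion from the cell. It is unrelated to the exosome in this paper, and the two should not be confused. Cryo-EM is a technique that keeps biological macromolecules as close as possible to their active state in the organism while resolving their structure at high, atomic-level resolution. Using it, the paper analyzes in detail the structure and assembly of the NEXT core proteins MTR4, RBM7, and ZCCHC8, including the channel through which the assembled complex binds substrate RNA. Together with other molecular experiments (mutant cell line construction, immunoprecipitation, RNA expression profiling by sequencing, and others), it analyzes and confirms their roles in substrate RNA recognition. It also studies and discusses how mutant forms such as the ZCCHC8-ROS1 fusion affect the relevant enzymatic activity and the resulting phenotypes or disease. This is a typical structural biology paper; its subject is a fundamental biological question common to all eukaryotes and archaea, and it has textbook-level significance.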
IF: 45.500, Q1. Cell, 2022-06-09. DOI: 10.1016/j.cell.2022.04.016 PMID: 35688134
Abstract:
RNA quality control relies on co-factors and adaptors to identify and prepare substrates for degradation by ribonucleases such as the 3' to 5' ribonucleolytic RNA exosome. Here, we determined cryogenic electron microscopy structures of human nuclear exosome targeting (NEXT) complexes bound to RNA that reveal mechanistic insights to substrate recognition and early steps that precede RNA handover to the exosome. The structures illuminate ZCCHC8 as a scaffold, mediating homodimerization while embracing the MTR4 helicase and flexibly anchoring RBM7 to the helicase core. All three subunits collaborate to bind the RNA, with RBM7 and ZCCHC8 surveying sequences upstream of the 3' end to facilitate RNA capture by MTR4. ZCCHC8 obscures MTR4 surfaces important for RNA binding and extrusion as well as MPP6-dependent recruitment and docking onto the RNA exosome core, interactions that contribute to RNA surveillance by coordinating RNA capture, translocation, and extrusion from the helicase to the exosome for decay.
106.
颜林林 (2022-06-10 07:29):
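#paper doi:10.1186/1471-2199-7-3 BMC Molecular Biology, 2006, The RIN: an RNA integrity number for assigning integrity values to RNA measurements. In molecular biology experiments, when it comes to RNA quality control and assessing molecular integrity, the most important and most widely used metric is the RIN (RNA Integrity Number). This paper from 16 years ago describes the RIN algorithm and how it was built. Agilent, Roche, and Quantiom Bioinformatics took part in the work. The algorithm became a standard feature and key output of the 2100 Bioanalyzer, which is still in use today. Before RIN, integrity was usually assessed by the ratio of 28S to 18S rRNA (typically required to be at least 2.0). That ratio depends on how the gel image is displayed and on manual measurement, so it is often unstable and varies widely between laboratories; more importantly, it frequently does not correlate with actual RNA integrity. The authors therefore developed a method that uses capillary electrophoresis to capture the abundance of nucleic acid molecules of every length in the sample, automatically extracts features from that signal, builds regression models with Bayesian methods and neural networks, selects the feature combination that best estimates RNA integrity, and computes the RIN (range 1-10, where 1 means fully degraded and 10 means no degradation). The researchers extracted RNA from various organs and tissues of human, rat, and mouse and from a range of cell lines, collecting 1,208 samples in total. Most were intact or fully degraded, with a sufficient number of partially degraded samples also included. By mixing these in different proportions they built a set of real samples spanning many degradation levels, which was used to generate data, extract features, build the classification model (whose output is the RIN, an integer from 1 to 10), and evaluate its performance. The rRNA ratio and RIN computed by the algorithm were also compared with RT-PCR data (for 4 housekeeping genes) to confirm that they reflect RNA quality and integrity.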
BMC molecular biology, 2006-Jan-31. PMID: 16448564 PMCID: PMC1413964
Abstract:
BACKGROUND: The integrity of RNA molecules is of paramount importance for experiments that try to reflect the snapshot of gene expression at the moment of RNA extraction. Until recently, there has been no reliable standard for estimating the integrity of RNA samples and the ratio of 28S:18S ribosomal RNA, the common measure for this purpose, has been shown to be inconsistent. The advent of microcapillary electrophoretic RNA separation provides the basis for an automated high-throughput approach, in order to estimate the integrity of RNA samples in an unambiguous way. METHODS: A method is introduced that automatically selects features from signal measurements and constructs regression models based on a Bayesian learning technique. Feature spaces of different dimensionality are compared in the Bayesian framework, which allows selecting a final feature combination corresponding to models with high posterior probability. RESULTS: This approach is applied to a large collection of electrophoretic RNA measurements recorded with an Agilent 2100 bioanalyzer to extract an algorithm that describes RNA integrity. The resulting algorithm is a user-independent, automated and reliable procedure for standardization of RNA quality control that allows the calculation of an RNA integrity number (RIN). CONCLUSION: Our results show the importance of taking characteristics of several regions of the recorded electropherogram into account in order to get a robust and reliable prediction of RNA integrity, especially if compared to traditional methods.
107.
颜林林 (2022-06-09 07:56):
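#paper doi:10.1038/s41586-022-04803-0 Nature, 2022, Cohesin-mediated loop anchors confine the locations of human replication origins. This new Nature paper comes out of a large research program, the 4D Nucleome project (doi:10.1038/nature23884), which aims to develop and apply a range of technologies to study the spatial and temporal structure of the human and mouse genomes and so understand nuclear organization and function; "4D" means three-dimensional spatial structure plus change over time. Here the authors mainly analyze Hi-C data from the project's Phase 1, Tier 1 H1 human embryonic stem cell (hES) line. Using their own method, they identify TADs/subTADs (topologically associating domains and subdomains) genome-wide at high resolution and analyze how these are distributed relative to chromatin loops and DNA replication initiation zones (IZs). Combining this with how the data were collected and the cell-cycle stage of the cells, they propose a model relating cohesin-mediated loop extrusion to replication. They also verify the model by perturbing CTCF and cohesin with targeted CRISPR-Cas9 genome editing and observing the effect on replication. The work shows how omics data analysis can lead to a conceptual model in molecular and cell biology, and is well worth studying.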
IF: 50.500, Q1. Nature, 2022-06. DOI: 10.1038/s41586-022-04803-0 PMID: 35676475
Abstract:
DNA replication occurs through an intricately regulated series of molecular events and is fundamental for genome stability. At present, it is unknown how the locations of replication origins are determined in the human genome. Here we dissect the role of topologically associating domains (TADs), subTADs and loops in the positioning of replication initiation zones (IZs). We stratify TADs and subTADs by the presence of corner-dots indicative of loops and the orientation of CTCF motifs. We find that high-efficiency, early replicating IZs localize to boundaries between adjacent corner-dot TADs anchored by high-density arrays of divergently and convergently oriented CTCF motifs. By contrast, low-efficiency IZs localize to weaker dotless boundaries. Following ablation of cohesin-mediated loop extrusion during G1, high-efficiency IZs become diffuse and delocalized at boundaries with complex CTCF motif orientations. Moreover, G1 knockdown of the cohesin unloading factor WAPL results in gained long-range loops and narrowed localization of IZs at the same boundaries. Finally, targeted deletion or insertion of specific boundaries causes local replication timing shifts consistent with IZ loss or gain, respectively. Our data support a model in which cohesin-mediated loop extrusion and stalling at a subset of genetically encoded TAD and subTAD boundaries is an essential determinant of the locations of replication origins in human S phase.
108.
颜林林 (2022-06-08 07:44):
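#paper doi:10.1016/j.gpb.2022.05.006 Genomics, Proteomics & Bioinformatics, 2022, Systematic cross-biospecimen evaluation of DNA extraction kits for long- and short-read multi-metagenomic sequencing studies. DNA extraction is the first wet-lab step in a metagenomic study, and its quality and consistency are critical for all downstream results. The authors collected bile, stool, saliva, plaque, sputum, and conjunctival swab samples from enrolled subjects and extracted DNA with three commercial kits. The extracted DNA was made into libraries and sequenced with both second-generation sequencing (BGI DNBSEQ-G400 sequencer) and third-generation sequencing (nanopore MinION Mk1B sequencer) to analyze and assess microbial composition. The results show that differences between DNA extraction kits are indeed large, but differences between specimen types are larger still. Alpha diversity, a key metric in metagenomics, is also clearly affected by kit and sequencing depth. In contrast, the microbial composition and taxonomic profiles obtained on different sequencing platforms are essentially consistent, so bias comes mainly from upstream sample processing rather than from library preparation and sequencing. Because microbial community composition differs so much between extraction kits, the paper recommends a single-kit strategy for studies that aim to directly compare multiple microbiota from the same patient, to avoid interference from kit choice.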
Abstract:
High-quality DNA extraction is a crucial step in metagenomic studies. Bias by different isolation kits impairs the comparison across datasets. A trending topic is, however, the analysis of multiple metagenomes from the same patients to draw a holistic picture of microbiota associated with diseases. We thus collected bile, stool, saliva, plaque, sputum, and conjunctival swab samples and performed DNA extraction with three commercial kits. For each combination of the specimen type and DNA extraction kit, 20-gigabase (Gb) metagenomic data were generated using short-read sequencing. While profiles of the specimen types showed close proximity to each other, we observed notable differences in the alpha diversity and composition of the microbiota depending on the DNA extraction kits. No kit outperformed all selected kits on every specimen. We reached consistently good results using the Qiagen QIAamp DNA Microbiome Kit. Depending on the specimen, our data indicate that over 10 Gb of sequencing data are required to achieve sufficient resolution, but DNA-based identification is superior to identification by mass spectrometry. Finally, long-read nanopore sequencing confirmed the results (correlation coefficient > 0.98). Our results thus suggest using a strategy with only one kit for studies aiming for a direct comparison of multiple microbiotas from the same patients.
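Entry 108 notes that alpha diversity shifts with extraction kit and sequencing depth. As a rough illustration of what is being measured, the sketch below computes the Shannon index, one common alpha-diversity measure. This is an assumption for illustration only: neither the note nor the abstract states which measure the paper used, and the function name `shannon_index` and the read counts are hypothetical.

```python
import math

def shannon_index(counts):
    """Shannon diversity H' = -sum(p_i * ln p_i) over taxa with nonzero counts."""
    total = sum(counts)
    if total == 0:
        raise ValueError("at least one read is required")
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

# Hypothetical per-taxon read counts for the same specimen under two kits.
even_profile = [25, 25, 25, 25]    # four taxa recovered evenly
skewed_profile = [97, 1, 1, 1]     # one taxon dominates the extract
h_even = shannon_index(even_profile)
h_skewed = shannon_index(skewed_profile)
print(h_even, h_skewed)
```

An extraction kit that lyses one group of organisms far more efficiently than the others pushes a profile toward the skewed case and lowers the index, even though the underlying community is unchanged.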
109.
颜林林 (2022-06-07 07:13):
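#paper doi:10.1101/gr.276521.121 Genome Research, 2022, Precision environmental health monitoring by longitudinal exposome and multi-omics profiling. This paper on personal exposome and multi-omics profiling comes from Professor Michael Snyder's group at Stanford University. One subject was monitored continuously for 52 days: environmental indicators and exposures were measured; peripheral blood was drawn for complete blood count, metabolomics, proteomics, and cytokine assays; and stool was collected for gut microbiome profiling. The resulting data were analyzed statistically to find significant associations across omics layers and to study the relationship between environmental exposure and the individual's internal biomarkers. Exposome studies of this kind have usually been done at the population level. This multi-omics paper took years of data analysis and represents a lot of work, but with such a limited sample size it is hard to reach convincing novel conclusions, so the overall tone is descriptive. The subject was a 61-year-old man of European descent with a doctoral degree; given that data collection took place in 2016, he is most likely Michael Snyder himself, who was born in 1955.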
IF: 6.200, Q1. Genome research, 2022-06. DOI: 10.1101/gr.276521.121 PMID: 35667843
Abstract:
Conventional environmental health studies have primarily focused on limited environmental stressors at the population level, which lacks the power to dissect the complexity and heterogeneity of individualized environmental exposures. Here, as a pilot case study, we integrated deep-profiled longitudinal personal exposome and internal multi-omics to systematically investigate how the exposome shapes a single individual's phenome. We annotated thousands of chemical and biological components in the personal exposome cloud and found they were significantly correlated with thousands of internal biomolecules, which was further cross-validated using corresponding clinical data. Our results showed that agrochemicals and fungi predominated in the highly diverse and dynamic personal exposome, and the biomolecules and pathways related to the individual's immune system, kidney, and liver were highly associated with the personal external exposome. Overall, this data-driven longitudinal monitoring study shows the potential dynamic interactions between the personal exposome and internal multi-omics, as well as the impact of the exposome on precision health by producing abundant testable hypotheses.
110.
颜林林 (2022-06-06 07:18):
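#paper doi:10.1016/j.copbio.2022.102691 Current Opinion in Biotechnology, New opportunities for genetic code expansion in synthetic yeast. This is a review and outlook on yeast-based biosynthetic systems built de novo. To explore fundamental questions such as the origin and evolution of life, people have long tried to synthesize life from scratch and have made steady progress, including fully synthesizing every chromosome of a yeast cell and assembling them into a biologically active synthetic yeast. This system (the synthetic yeast Sc2.0) is nearing completion and will be the first eukaryote built from the bottom up. Because its genetic background and intracellular environment can be precisely controlled, researchers can study in depth scenarios that rarely occur in nature, such as codon reassignment and the incorporation of non-canonical amino acids, and thereby precisely control protein structure and function. This includes tackling practical problems that conventional cell engineering handles poorly, such as overexpression, and it opens new opportunities for developing protein-based therapeutics, materials, and catalysts.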
Abstract:
The synthetic yeast, Sc2.0, is nearing completion as consolidation of all 17 synthetic chromosomes into a single cell advances. This organism will be the first synthetic eukaryote and provides a highly plastic biological chassis built from the bottom-up using principles of biological design. This synthetic approach to genome construction has allowed the genetic code to be re-wired in this background to liberate the amber stop codon as a dedicated triplet for encoding non-canonical amino acids. The availability of an expanded set of amino acid building blocks allows precise control of protein structure and function, providing new opportunities to develop protein-based therapeutics, materials and catalysts. In this article, we review the challenges facing genetic code expansion research in yeast and highlight how the development of Sc2.0 provides new and exciting opportunities to address existing limitations.
111.
颜林林 (2022-06-05 06:41):
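#paper doi:10.1186/s12864-022-08435-6 BMC Genomics, Frameshift and wild-type proteins are often highly similar because the genetic code and genomes were optimized for frameshift tolerance. In genome and molecular evolution research, we usually assume that frameshift mutations (indels whose length is not a multiple of 3) completely abolish protein function, and discussions of the genetic code's error tolerance are usually limited to the third codon position. Methodologically this is a classic, purely bioinformatic study: using public data and sequence analysis, it characterizes frameshift mutations and finds, and verifies, that frameshifted proteins retain some similarity to their wild-type counterparts in sequence and in amino acid physicochemical properties. This retained similarity differs significantly from what fully random mutation would give. The authors conclude that the genome sequences of various species, as well as the genetic code itself, are "optimized" for frameshift tolerance, which offers a new angle on how the genetic code evolved. With so much public biological sequence data available today, this kind of work makes full use of it: it starts from a few simple and reasonable assumptions, relies on basic bioinformatic techniques such as sequence analysis together with the appropriate statistical tests, and answers basic biological questions carefully and solidly. I personally like this kind of research very much.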
IF: 3.500, Q2. BMC genomics, 2022-Jun-02. DOI: 10.1186/s12864-022-08435-6 PMID: 35655139
Abstract:
Frameshift mutations have been considered of significant importance for the molecular evolution of proteins and their coding genes, while frameshift protein sequences encoded in the alternative reading frames of coding genes have been considered to be meaningless. However, functional frameshifts have been found widely existing. It was puzzling how a frameshift protein kept its structure and functionality while substantial changes occurred in its primary amino-acid sequence. This study shows that the similarities among frameshifts and wild types are higher than random similarities and are determined at different levels. Frameshift substitutions are more conservative than random substitutions in the standard genetic code (SGC). The frameshift substitutions score of SGC ranks in the top 2.0-3.5% of alternative genetic codes, showing that SGC is nearly optimal for frameshift tolerance. In many genes and certain genomes, frameshift-resistant codons and codon pairs appear more frequently than expected, suggesting that frameshift tolerance is achieved through not only the optimality of the genetic code but, more importantly, the further optimization of a specific gene or genome through the usages of codons/codon pairs, which sheds light on the role of frameshift mutations in molecular and genomic evolution.
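The comparison at the heart of entry 111 (a protein read in its normal frame versus the same DNA read in a shifted frame) can be sketched as below. This is a simplified illustration, not the paper's pipeline: reading from offset 1 stands in for a single-base deletion at the start of the gene, plain per-position identity stands in for the substitution-score similarity the abstract describes, and the names `translate`, `identity`, and the toy sequence are hypothetical. The codon table is the standard genetic code written in the usual TCAG order.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, codons enumerated TTT, TTC, TTA, TTG, TCT, ... GGG.
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}

def translate(dna, frame=0):
    """Translate starting at offset `frame`; trailing partial codons are ignored
    and stop codons are written as '*'."""
    codons = (dna[i:i + 3] for i in range(frame, len(dna) - 2, 3))
    return "".join(CODON_TABLE[c] for c in codons)

def identity(a, b):
    """Fraction of identical residues at the same position (no gaps allowed)."""
    n = min(len(a), len(b))
    return sum(x == y for x, y in zip(a, b)) / n if n else 0.0

gene = "ATGGCTGAAGTTCTG"           # toy coding sequence, not from the paper
wild_type = translate(gene)         # normal reading frame
shifted = translate(gene, frame=1)  # +1 frame, mimicking a one-base deletion
print(wild_type, shifted, identity(wild_type, shifted))
```

A fuller version would score each wild-type/frameshift residue pair with a substitution matrix and compare the total against randomly generated alternative codes, which is the kind of ranking the abstract reports.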
112.
颜林林 (2022-06-04 07:46):
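#paper doi:10.3322/caac.21727 CA: A Cancer Journal for Clinicians, 2022, Oncologic emergencies and urgencies: A comprehensive review. This is a review of oncologic emergencies. Thanks to rapid progress in cancer diagnosis and treatment, cancer is increasingly approaching a chronic disease, yet its mortality and burden remain among the highest. Cancers diagnosed through emergency presentation are usually advanced, account for 11% to 29% of all new cancer diagnoses, and carry a poor prognosis. The paper comprehensively reviews these emergencies, including their underlying causes and clinical pathways. It covers conditions common during cancer progression, such as febrile neutropenia, hypercalcemia, tumor lysis syndrome, malignant spinal cord compression, mechanical bowel obstruction, and breakthrough pain, as well as the syndrome of inappropriate antidiuretic hormone secretion, venous thromboembolism, and malignant effusions that accompany particular cancer types. Adverse reactions and urgent complications caused by cancer treatment, including small-molecule targeted drugs, immune checkpoint inhibitors, and CAR-T therapy, are also described in detail. I remember a clinician at an academic meeting sharing their own experience: patients who develop adverse reactions during immunotherapy often show a more pronounced response to the drug if the complications are handled and controlled promptly. Understanding oncologic emergencies and their management therefore matters both for clarifying the harms and causes of cancer and for developing and refining diagnostic and therapeutic approaches.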
Abstract:
Patients with advanced cancer generate 4 million visits annually to emergency departments (EDs) and other dedicated, high-acuity oncology urgent care centers. Because of both the increasing complexity of systemic treatments overall and the higher rates of active therapy in the geriatric population, many patients experiencing acute decompensations are frail and acutely ill. This article comprehensively reviews the spectrum of oncologic emergencies and urgencies typically encountered in acute care settings. Presentation, underlying etiology, and up-to-date clinical pathways are discussed. Criteria for either a safe discharge to home or a transition of care to the inpatient oncology hospitalist team are emphasized. This review extends beyond familiar conditions such as febrile neutropenia, hypercalcemia, tumor lysis syndrome, malignant spinal cord compression, mechanical bowel obstruction, and breakthrough pain crises to include a broader spectrum of topics encompassing the syndrome of inappropriate antidiuretic hormone secretion, venous thromboembolism and malignant effusions, as well as chemotherapy-induced mucositis, cardiomyopathy, nausea, vomiting, and diarrhea. Emergent and urgent complications associated with targeted therapeutics, including small molecules, naked and drug-conjugated monoclonal antibodies, as well as immune checkpoint inhibitors and chimeric antigen receptor T-cells, are summarized. Finally, strategies for facilitating same-day direct admission to hospice from the ED are discussed. This article not only can serve as a point-of-care reference for the ED physician but also can assist outpatient oncologists as well as inpatient hospitalists in coordinating care around the ED visit.
113.
颜林林 (2022-06-03 12:22):
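#paper doi:10.1038/s41586-022-04759-1 Nature 2022, Fast and efficient DNA replication with purified human proteins. This new Nature paper is a model of the reductionist approach. Starting from 43 purified polypeptides, the authors reconstituted the human DNA replication factors in vitro, assembled them into a biologically active complex, and reproduced fast and efficient DNA synthesis. Because the system is built entirely from defined components, it can be used to study which components are necessary for DNA replication and how they relate to one another, and to confirm that no additional molecules are required. The study focuses on the roles of PCNA, CLASPIN, TIMELESS-TIPIN, AND-1, and other components at the corresponding steps of human DNA synthesis.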
IF: 50.500, Q1. Nature, 2022-06. DOI: 10.1038/s41586-022-04759-1 PMID: 35585232
Abstract:
Chromosome replication is performed by a complex and intricate ensemble of proteins termed the replisome, where the DNA polymerases Polδ and Polε, DNA polymerase α-primase (Polα) and accessory proteins including AND-1, CLASPIN and TIMELESS-TIPIN (respectively known as Ctf4, Mrc1 and Tof1-Csm3 in Saccharomyces cerevisiae) are organized around the CDC45-MCM-GINS (CMG) replicative helicase. Because a functional human replisome has not been reconstituted from purified proteins, how these factors contribute to human DNA replication and whether additional proteins are required for optimal DNA synthesis are poorly understood. Here we report the biochemical reconstitution of human replisomes that perform fast and efficient DNA replication using 11 purified human replication factors made from 43 polypeptides. Polε, but not Polδ, is crucial for optimal leading-strand synthesis. Unexpectedly, Polε-mediated leading-strand replication is highly dependent on the sliding-clamp processivity factor PCNA and the alternative clamp loader complex CTF18-RFC. We show how CLASPIN and TIMELESS-TIPIN contribute to replisome progression and demonstrate that, in contrast to the budding yeast replisome, AND-1 directly augments leading-strand replication. Moreover, although AND-1 binds to Polα, the interaction is dispensable for lagging-strand replication, indicating that Polα is functionally recruited via an AND-1-independent mechanism for priming in the human replisome. Collectively, our work reveals how the human replisome achieves fast and efficient leading-strand and lagging-strand DNA replication, and provides a powerful system for future studies of the human replisome and its interactions with other DNA metabolic processes.
114.
颜林林 (2022-06-02 07:08):
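#paper doi:10.1101/gr.276193.121 Genome Research 2022, Genetic, epigenetic, and environmental mechanisms govern allele-specific gene expression. Every gene in a diploid organism has two allelic copies, and which allele is expressed is not random; what determines which allele is switched on is a question worth studying. The authors crossed two mouse strains to produce an F1 generation that is heterozygous at most genes and used this model to study allele-specific expression (ASE), so that sequencing can readily identify, at scale, the allele of origin (paternal or maternal) of each expressed gene. Because this is an animal model, controlled comparisons such as high-fat versus low-fat diet are easy to run, and samples can be taken from different organs and tissues to study ASE events across groups and tissues. With this system the paper found several thousand genes with ASE and examined how gene sequence, epigenetic state, diet, and tissue type affect ASE.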
IF: 6.200, Q1. Genome research, 2022-06. DOI: 10.1101/gr.276193.121 PMID: 35501130
Abstract:
Allele-specific expression (ASE) is a phenomenon in which one allele is preferentially expressed over the other. Genetic and epigenetic factors cause ASE by altering the final composition of a gene's product, leading to expression imbalances that can have functional consequences on phenotypes. Environmental signals also impact allele-specific expression, but how they contribute to this cross talk remains understudied. Here, we explored how genotype, parent-of-origin, tissue, sex, and dietary fat simultaneously influence ASE biases. Male and female mice from an F1 reciprocal cross of the LG/J and SM/J strains were fed a high or low fat diet. We harnessed strain-specific variants to distinguish between two ASE classes: parent-of-origin-dependent (unequal expression based on parental origin) and sequence-dependent (unequal expression based on nucleotide identity). We present a comprehensive map of ASE patterns in 2853 genes across three tissues and nine environmental contexts. We found that both ASE classes are highly dependent on tissue and environmental context. They vary across metabolically relevant tissues, between males and females, and in response to dietary fat. We also found 45 genes with inconsistent ASE biases that switched direction across tissues and/or environments. Finally, we integrated ASE and QTL data from published intercrosses of the LG/J and SM/J strains. Our ASE genes are often enriched in QTLs for metabolic and musculoskeletal traits, highlighting how this orthogonal approach can prioritize candidate genes. Together, our results provide novel insights into how genetic, epigenetic, and environmental mechanisms govern allele-specific expression, which is an essential step toward deciphering the genotype-to-phenotype map.
115.
颜林林 (2022-06-01 07:41):
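#paper doi:10.1101/2022.05.29.493900 bioRxiv 2022, Cost-efficient whole genome-sequencing using novel mostly natural sequencing-by-synthesis chemistry and open fluidics platform. This is new work from Ultima Genomics, a startup out of MIT, that rethinks the design principles behind current sequencing-by-synthesis methods. By building the fluidics and optics around a large circular wafer, it makes the reagents and consumables cheaper. Where Illumina sequencing adds bases with reversible terminators in every cycle, this work uses non-terminating chemistry, which speeds up base incorporation, and pairs it with a CNN-based algorithm for accurate base calling. In practice, the method delivers high-throughput sequencing in under 20 hours, with read lengths of about 300 bp and high quality (Q30 > 85%), at a cost of $1 per Gb. The paper also benchmarks the platform on GIAB and 1000 Genomes samples to verify the accuracy of its results. Many of us work with high-throughput sequencing every day and have long taken the Illumina chemistry for granted, accepting its monopoly and its status as the ceiling, and we seldom ask what could still be improved. This paper is a chance to broaden that view.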
Abstract:
We introduce a massively parallel novel sequencing platform that combines an open flow cell design on a circular wafer with a large surface area and mostly natural nucleotides that allow optical end-point detection without reversible terminators. This platform enables sequencing billions of reads with longer read length (~300bp) and fast run times (<20hrs) with high base accuracy (Q30 > 85%), at a low cost of $1/Gb. We establish system performance by whole-genome sequencing of the Genome-In-A-Bottle reference samples HG001-7, demonstrating high accuracy for SNPs (99.6%) and Indels in homopolymers up to length 10 (96.4%) across the vast majority (>98%) of the defined high-confidence regions of these samples. We demonstrate scalability of the whole-genome sequencing workflow by sequencing an additional 224 selected samples from the 1000 Genomes project achieving high concordance with reference data.
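For readers less familiar with the "Q30 > 85%" figure quoted above, here is a minimal sketch of what it means under the standard Phred convention, where a quality score Q corresponds to an error probability of 10^(-Q/10). The function names and the example quality list are hypothetical illustrations and are not taken from the paper.

```python
import math


def phred_to_error_prob(q: float) -> float:
    """Convert a Phred quality score into the probability that the base call is wrong."""
    return 10 ** (-q / 10)


def error_prob_to_phred(p: float) -> float:
    """Inverse conversion: error probability back to a Phred quality score."""
    return -10 * math.log10(p)


def fraction_at_least(quals, threshold=30):
    """Fraction of bases whose quality is at or above the threshold (the 'Q30' metric)."""
    return sum(q >= threshold for q in quals) / len(quals)


# Hypothetical per-base qualities, for illustration only.
example_quals = [10, 30, 35, 40]
q30_fraction = fraction_at_least(example_quals)
```

Under this convention, Q30 means at most one expected error per 1,000 base calls, so "Q30 > 85%" says that more than 85% of bases meet that bar.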
116.
颜林林 (2022-05-31 07:28):
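#paper doi:10.1038/s41586-021-03583-3 Nature 2021, Swarm Learning for decentralized and confidential clinical machine learning. Precision medicine has advanced thanks to the rapid accumulation of data, yet data sharing remains a major obstacle to using that data fully. This paper proposes Swarm Learning, an approach that combines edge computing, blockchain, and related technologies so that data owners can have their data integrated and used worldwide without violating privacy regulations. This addresses the large-scale data integration that precision-medicine goals such as drug target discovery and diagnostic biomarker discovery urgently need. To test feasibility, the authors chose four disease use cases (COVID-19, tuberculosis, leukaemia, and lung pathologies), covering blood transcriptomes and chest X-rays; these data are broadly heterogeneous and have unevenly distributed cases and controls. The data were distributed across network nodes that could join the computation dynamically, and the resulting models identified or subtyped these diseases and were compared against conventional machine-learning results. The paper was recently cited again in a comment in Nature Reviews Immunology, which noted that its leukaemia clinical diagnostic has been successfully standardized in the EU and subsequently commercialized, further supporting the method's practical value. Because it collaborates by "sharing insights, not data", it should also play a larger role in complex research areas such as medical immunology.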
IF:50.500Q1 Nature, 2021-06. DOI: 10.1038/s41586-021-03583-3 PMID: 34040261 PMCID:PMC8189907
Abstract:
Fast and reliable detection of patients with severe and heterogeneous illnesses is a major goal of precision medicine. Patients with leukaemia can be identified using machine learning on the basis of their blood transcriptomes. However, there is an increasing divide between what is technically possible and what is allowed, because of privacy legislation. Here, to facilitate the integration of any medical data from any data owner worldwide without violating privacy laws, we introduce Swarm Learning, a decentralized machine-learning approach that unites edge computing, blockchain-based peer-to-peer networking and coordination while maintaining confidentiality without the need for a central coordinator, thereby going beyond federated learning. To illustrate the feasibility of using Swarm Learning to develop disease classifiers using distributed data, we chose four use cases of heterogeneous diseases (COVID-19, tuberculosis, leukaemia and lung pathologies). With more than 16,400 blood transcriptomes derived from 127 clinical studies with non-uniform distributions of cases and controls and substantial study biases, as well as more than 95,000 chest X-ray images, we show that Swarm Learning classifiers outperform those developed at individual sites. In addition, Swarm Learning completely fulfils local confidentiality regulations by design. We believe that this approach will notably accelerate the introduction of precision medicine.
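The core idea described above, that nodes exchange model parameters rather than raw data and no central server holds the data, can be illustrated with a toy sketch. This is not the paper's implementation: it omits the blockchain-based coordination, dynamic node onboarding, and real classifiers entirely, and it simply assumes every peer computes the same sample-size-weighted average each round. All names (`local_step`, `merge`, `swarm_train`, `NODES`) and the toy data are hypothetical.

```python
def local_step(w, data, lr=0.1):
    """One gradient-descent step for the model y ≈ w * x on one node's private data."""
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    return w - lr * grad


def merge(params, sizes):
    """Sample-size-weighted average of the peers' parameters.

    Only parameters cross node boundaries; the (x, y) records never leave a node.
    """
    total = sum(sizes)
    return sum(p * n for p, n in zip(params, sizes)) / total


def swarm_train(node_data, rounds=50):
    """Alternate local training and peer-to-peer parameter merging."""
    sizes = [len(d) for d in node_data]
    weights = [0.0] * len(node_data)
    for _ in range(rounds):
        # Each node trains on its own data only.
        weights = [local_step(w, d) for w, d in zip(weights, node_data)]
        # Assumption: every peer receives all parameters and computes the same average.
        merged = merge(weights, sizes)
        weights = [merged] * len(node_data)
    return weights[0]


# Toy data: three nodes of unequal size, all following y = 3x.
NODES = [
    [(1.0, 3.0), (2.0, 6.0)],
    [(0.5, 1.5)],
    [(1.5, 4.5), (1.0, 3.0), (0.5, 1.5)],
]
fitted = swarm_train(NODES)
```

On this toy data the shared parameter converges to the true slope of 3 even though no node ever sees another node's records.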
117.
颜林林 (2022-05-29 23:45):
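#paper doi:10.1016/j.ccell.2022.05.005 Cancer Cell, 2022, Redefining breast cancer subtypes to guide treatment prioritization and maximize response: Predictive biomarkers across 10 cancer therapies. This paper reports on I-SPY2 (NCT01042379), an ongoing, open-label, adaptive, randomized phase II, multicenter clinical trial. The trial enrolls patients with high-risk stage II and III breast cancer, uses the 70-gene MammaPrint test to exclude those who would not benefit from chemotherapy, and randomizes the rest to 10 different treatment arms. At several time points during neoadjuvant therapy before surgery, patients undergo MRI, core biopsy, and/or peripheral blood collection; the pre-treatment biopsy samples were also profiled with gene expression arrays, protein phosphorylation assays, and immunohistochemistry/in situ hybridization. Using these measurements together with patients' treatment responses, the study reclassified the 987 enrolled patients into five subtypes: HER2-/Immune-/DRD-, HER2-/Immune+, HER2-/Immune-/DRD+, HER2+/BP-HER2_or_Basal, and HER2+/BP-Luminal. Beyond the HR/HER2 status used in the conventional definition, the redefinition also incorporates other phenotypic features such as proliferation, DRD (DNA repair deficiency), and immune status. It aims for subtypes that point to the best treatment, meaning the highest probability of pCR (pathologic complete response), while keeping the assay platform robust and clinical implementation simple. Although the number of cases in each arm is not yet very large, as the trial continues and more cases accumulate, this line of work should yield treatment decision paths better than current guideline recommendations. The study design and analysis approach should also carry over to other cancer types and, with high-throughput multi-omics assays developing rapidly, keep producing practical precision-medicine applications.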
IF:48.800Q1 Cancer cell, 2022-06-13. DOI: 10.1016/j.ccell.2022.05.005 PMID: 35623341 PMCID:PMC9426306
Abstract:
Using pre-treatment gene expression, protein/phosphoprotein, and clinical data from the I-SPY2 neoadjuvant platform trial (NCT01042379), we create alternative breast cancer subtypes incorporating tumor biology beyond clinical hormone receptor (HR) and human epidermal growth factor receptor-2 (HER2) status to better predict drug responses. We assess the predictive performance of mechanism-of-action biomarkers from ∼990 patients treated with 10 regimens targeting diverse biology. We explore >11 subtyping schemas and identify treatment-subtype pairs maximizing the pathologic complete response (pCR) rate over the population. The best performing schemas incorporate Immune, DNA repair, and HER2/Luminal phenotypes. Subsequent treatment allocation increases the overall pCR rate to 63% from 51% using HR/HER2-based treatment selection. pCR gains from reclassification and improved patient selection are highest in HR subsets (>15%). As new treatments are introduced, the subtyping schema determines the minimum response needed to show efficacy. This data platform provides an unprecedented resource and supports the usage of response-based subtypes to guide future treatment prioritization.
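The phrase "treatment-subtype pairs maximizing the pCR rate over the population" can be made concrete with a small sketch: given a pCR rate for each (subtype, treatment) pair and the prevalence of each subtype, pick the best treatment per subtype and compute the prevalence-weighted overall rate. This is only an illustration of the arithmetic, not the authors' statistical procedure, and the subtype names, treatment names, and numbers below are invented, not trial results.

```python
def best_allocation(pcr, prevalence):
    """Pick the treatment with the highest pCR rate for each subtype.

    pcr:        {subtype: {treatment: pCR rate}}
    prevalence: {subtype: fraction of the population}, summing to 1
    Returns the chosen treatment per subtype and the population-level pCR rate.
    """
    allocation = {s: max(rates, key=rates.get) for s, rates in pcr.items()}
    overall = sum(prevalence[s] * pcr[s][allocation[s]] for s in pcr)
    return allocation, overall


# Invented numbers for illustration only.
example_pcr = {
    "subtype_1": {"drug_A": 0.20, "drug_B": 0.50},
    "subtype_2": {"drug_A": 0.60, "drug_B": 0.30},
}
example_prevalence = {"subtype_1": 0.4, "subtype_2": 0.6}

allocation, overall_rate = best_allocation(example_pcr, example_prevalence)
```

Comparing this overall rate across candidate subtyping schemas is one simple way to see why a schema that separates responders more cleanly yields a higher population-level pCR rate.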
118.
颜林林 (2022-04-30 18:41):
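#paper doi:10.1016/j.ccell.2022.04.002 Cancer Cell, 2022, The translational challenges of precision oncology. This is a recent review in Cancer Cell on precision oncology, meaning diagnosis and treatment decisions based on a tumor's molecular characteristics. The review looks back at the history and current state of research on tumor molecular features and, across the stages of care (cancer initiation, prevention, early detection, adjuvant therapy, minimal residual disease monitoring, drug resistance, tumor evolution, and metastasis), discusses the discovery and application of the key molecular features at each stage. It touches on the applications now found in the tumor genetic testing industry, including the relevant clinical cohort studies and resources, and is comprehensive and clearly organized overall. It suits beginners who want a quick overview of industrial and clinical applications in this area, and I strongly recommend following its references to explore each specific application in depth.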
IF:48.800Q1 Cancer cell, 2022-05-09. DOI: 10.1016/j.ccell.2022.04.002 PMID: 35487215
Abstract:
The translational challenges in the field of precision oncology are in part related to the biological complexity and diversity of this disease. Technological advances in genomics have facilitated large sequencing efforts and discoveries that have further supported this notion. In this review, we reflect on the impact of these discoveries on our understanding of several concepts: cancer initiation, cancer prevention, early detection, adjuvant therapy and minimal residual disease monitoring, cancer drug resistance, and cancer evolution in metastasis. We discuss key areas of focus for improving cancer outcomes, from biological insights to clinical application, and suggest where the development of these technologies will lead us. Finally, we discuss practical challenges to the wider adoption of molecular profiling in the clinic and the need for robust translational infrastructure.
119.
颜林林 (2022-03-20 16:16):
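#paper doi:10.1101/2022.03.14.22272390 medRxiv, 2022, AI-Augmented Clinical Decision Support in a Patient-Centric Precision Oncology Registry. This article describes an effort to support precision cancer diagnosis and treatment with AI. The authors are from xCures (founded in 2018); the corresponding author, Jeff Shrager, is a founder of the company and an adjunct professor at Stanford. Medical expert systems with similar aims date back at least to the 1970s. Even now, when AI seems to have reached many corners of medicine, there is still no effective general-purpose solution for clinical decision-making, and the setbacks of IBM Watson made the practical difficulties facing this grand ambition all the more apparent. The article describes a clinical trial run by xCures (NCT03793088), which set up an online platform, XCELSIOR, to register cancer patients and uses NLP and other techniques to structure and standardize their medical records. It bills itself as "patient-centric", which presumably means building something like a comprehensive health record for each patient and then combining it with public databases and other information to produce a ranked list of candidate treatment options. A virtual tumor board (VTB) of molecular pharmacologists and oncologists reviews the list manually, and the results go to physicians and patients to guide subsequent treatment decisions. The trial targets patients with refractory or advanced cancer and plans to enroll 10,000 people (it started in 2019 and is expected to complete in 2024); more than 2,000 have enrolled so far, with about 15 new patients per week. Its goal, as the company website puts it, is that it "uses artificial intelligence (A.I.) and predictive modeling to identify and rank the most promising treatment options for people with cancer who have exhausted the standard of care". I personally believe that similar efforts are far from rare around the world and that they will eventually prove extremely valuable. How that value will materialize is not yet clear: this article does not show where its distinctive advantage lies, and the trial's endpoints are vague, so more information is needed. Judging from the detailed description in the article and its supplementary materials, the work matters more as engineering than as science, and the preprint probably serves mostly as publicity. Whether it will bring the company enough profit is hard to say, but it is run as a formal clinical trial with an apparently serious effort to build and operate a system, so the details are still worth following and learning from.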
120.
颜林林 (2022-03-06 20:48):
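#paper doi:10.1101/2021.07.19.452956, bioRxiv, 2022, The Tabula Sapiens: a multiple organ single cell transcriptomic atlas of humans. This preprint presents a landmark resource for single-cell transcriptomics. It covers 24 tissues and organs from 15 donors (who generally died of stroke, trauma, or anoxia; see https://tabula-sapiens-portal.ds.czbiohub.org/whereisthedata), from which nearly 500,000 single cells were isolated and sequenced with 10x and/or SmartSeq2. The analysis yields tissue-specific expression data for more than 400 cell types and provides reference atlases for the distribution of T cell clones across tissues, tissue-specific mutation rates in B cells, cell cycle states and the distribution of cell types across tissues and organs, and cell-type-specific RNA splicing across tissues within an individual. Using histological sections and H&E staining, it also relates the transcriptomic data to macroscopic, clinically relevant information such as spatial heterogeneity across tissue types and estimates of relative cell abundance. The project is run by the Tabula Sapiens Consortium. Its data (raw sequencing data and analysis results) are hosted on AWS, FigShare, CellXGene, and other platforms and are open to the world (although publishing atlas-scale or tissue-scale analyses is not permitted without the consent of the consortium and its partners). Details are on the project website (https://tabula-sapiens-portal.ds.czbiohub.org/), which also provides a pipeline to help users annotate and interpret their own data using its results. Two points are worth mentioning. First, the consortium and project are mainly supported by the Chan Zuckerberg Initiative, founded by Facebook founder Mark Zuckerberg and his wife Priscilla Chan (trained in biology); the initiative also supports bioRxiv and medRxiv. Second, the corresponding author, Stephen R Quake, is a towering figure in biotechnology. He was probably also one of the earliest well-known people to contribute his own genome to validate a high-throughput sequencing technology; see the 2009 NBT paper (doi:10.1038/nbt.1561), whose subject P0 (very likely Quake himself) was sequenced with the single-molecule high-throughput sequencing technology of the now-defunct Helicos Biosciences (arguably a third-generation technology; bear in mind that second-generation sequencing only took off around 2008), producing that technology's first human whole-genome data. I won't go into Quake's contributions here; interested readers can look them up.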