Papers shared by user 翁凯.
11 shared papers found.
1.
翁凯
(2025-03-05 21:45):
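#paper [DOI] 10.1038/s41586-024-07954-4; [Year] 2024; [Journal] Nature; [Title] Temporal recording of mammalian development and precancer. This study aims to overcome the limitations of conventional techniques for tracking cellular events by developing a CRISPR-based single-cell molecular clock platform that records mammalian development and tumor origins with temporal and spatial precision. Using self-mutating CRISPR barcodes, the authors capture gene expression and genetic variation simultaneously at single-cell resolution. They systematically characterize cell proliferation, differentiation, and clonal dynamics during mouse embryonic organogenesis, and uncover a previously unrecognized progenitor population in gut development along with its functional features. Extending this approach to human colorectal precancer samples (transcriptome data from 116 polyps and mutation data from 418 polyps), they show for the first time that roughly 15-30% of adenomas arise from multiple independent normal stem cells. This challenges the traditional monoclonal model of carcinogenesis and provides multi-dimensional evidence on the mechanisms of early cancer origin.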
2.
翁凯
(2025-03-02 17:41):
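#paper doi: 10.1038/s41587-024-02248-6; Year: 2024; Journal: Nature Biotechnology; Title: High-throughput discovery of MHC class I- and II-restricted T cell epitopes using synthetic cellular circuits. Conventional antigen detection techniques rely on primary human T cells, recognize only a few MHC types (such as human MHC class I), and cannot efficiently analyze low-affinity antigens or cross-species (e.g. mouse) models. To address this, the study develops a new technique called TCR-MAP. Its core is a genetically engineered Jurkat cell (a commonly used laboratory T cell line) that carries a specific T cell receptor (TCR) and an enzyme called Sortase A. When the TCR recognizes a peptide-MHC complex on the surface of an antigen-presenting cell (such as a virus-infected cell or a tumor cell), Sortase A is activated and attaches a biotin "tag" to the target cell surface; the tagged cells are then enriched with magnetic beads and sequenced to identify the antigen. Experiments show that the technique is compatible with both human and mouse MHC class I/II antigens. It identified CMV viral antigens, tumor antigens (such as CTAG1B), and autoimmune disease-related antigens (such as cardiac CKMT2), with detection sensitivity reaching the micromolar range (able to detect very small amounts of antigen), and it does not depend on unstable primary T cells. The platform offers an efficient tool for studying viral escape, developing tumor vaccines, and dissecting autoimmune disease mechanisms, and could in future be extended to non-protein antigens such as lipids.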
Nature Biotechnology,
2024-7-2.
DOI: 10.1038/s41587-024-02248-6
Abstract:
Antigen discovery technologies have largely focused on major histocompatibility complex (MHC) class I-restricted human T cell receptors (TCRs), leaving methods for MHC class II-restricted and mouse TCR reactivities relatively undeveloped. Here we present TCR mapping of antigenic peptides (TCR-MAP), an antigen discovery method that uses a synthetic TCR-stimulated circuit in immortalized T cells to activate sortase-mediated tagging of engineered antigen-presenting cells (APCs) expressing processed peptides on MHCs. Live, tagged APCs can be directly purified for deconvolution by sequencing, enabling TCRs with unknown specificity to be queried against barcoded peptide libraries in a pooled screening context. TCR-MAP accurately captures self-reactivities or viral reactivities with high throughput and sensitivity for both MHC class I-restricted and class II-restricted TCRs. We elucidate problematic cross-reactivities of clinical TCRs targeting the cancer/testis melanoma-associated antigen A3 and discover targets of myocarditis-inciting autoreactive T cells in mice. TCR-MAP has the potential to accelerate T cell antigen discovery efforts in the context of cancer, infectious disease and autoimmunity.
3.
翁凯
(2025-02-28 21:55):
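#paper 10.1038/s41586-025-08622-x. 2025. Nature. Comparative characterization of human accelerated regions in neurons. By comparing human accelerated regions (HARs) in excitatory neurons derived from human and chimpanzee induced pluripotent stem cells (iPS cells), this study reveals potential roles of HARs in human brain evolution. HAR202 lowers NPAS3 expression in human neurons by altering the binding affinity of multiple transcription factors, whereas the orthologous region in chimpanzee neurons enhances NPAS3 expression. In addition, 2xHAR.319 specifically enhances PUM2 expression in human neurons, which is essential for maintaining iPS cell pluripotency and for neuronal differentiation. Knocking out 2xHAR.319 reduces PUM2 expression and impairs the cells' capacity for self-renewal and differentiation. Finally, HAR26;2xHAR.178 promotes neurite outgrowth in human neurons by enhancing SOCS2 expression, while the orthologous region in chimpanzee neurons has no such effect. These findings offer new insight into the role of HARs in human brain evolution.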
4.
翁凯
(2025-01-31 23:36):
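#paper 10.1186/s12967-023-04576-8. 2023. Harnessing large language models (LLMs) for candidate gene prioritization and selection. This paper explores the feasibility of using large language models, in a knowledge-driven way, to interpret and filter the large gene lists that come out of omics data, and thereby to speed up the path to clinical insight. OpenAI's GPT-4 and Anthropic's Claude performed best. An important takeaway for me is that using current LLMs effectively is not the simple question-asking I had imagined. It seems to work more like a project: break it into small tasks, advance step by step, integrate additional information, and only then draw a conclusion. This reminds me that to use current LLMs well, I need to learn how to ask questions.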
Journal of Translational Medicine,
2023-10-16.
DOI: 10.1186/s12967-023-04576-8
Abstract:
Background: Feature selection is a critical step for translating advances afforded by systems-scale molecular profiling into actionable clinical insights. While data-driven methods are commonly utilized for selecting candidate genes, knowledge-driven methods must contend with the challenge of efficiently sifting through extensive volumes of biomedical information. This work aimed to assess the utility of large language models (LLMs) for knowledge-driven gene prioritization and selection.
Methods: In this proof of concept, we focused on 11 blood transcriptional modules associated with an Erythroid cells signature. We evaluated four leading LLMs across multiple tasks. Next, we established a workflow leveraging LLMs. The steps consisted of: (1) Selecting one of the 11 modules; (2) Identifying functional convergences among constituent genes using the LLMs; (3) Scoring candidate genes across six criteria capturing the gene’s biological and clinical relevance; (4) Prioritizing candidate genes and summarizing justifications; (5) Fact-checking justifications and identifying supporting references; (6) Selecting a top candidate gene based on validated scoring justifications; and (7) Factoring in transcriptome profiling data to finalize the selection of the top candidate gene.
Results: Of the four LLMs evaluated, OpenAI's GPT-4 and Anthropic's Claude demonstrated the best performance and were chosen for the implementation of the candidate gene prioritization and selection workflow. This workflow was run in parallel for each of the 11 erythroid cell modules by participants in a data mining workshop. Module M9.2 served as an illustrative use case. The 30 candidate genes forming this module were assessed, and the top five scoring genes were identified as BCL2L1, ALAS2, SLC4A1, CA1, and FECH. Researchers carefully fact-checked the summarized scoring justifications, after which the LLMs were prompted to select a top candidate based on this information. GPT-4 initially chose BCL2L1, while Claude selected ALAS2. When transcriptional profiling data from three reference datasets were provided for additional context, GPT-4 revised its initial choice to ALAS2, whereas Claude reaffirmed its original selection for this module.
Conclusions: Taken together, our findings highlight the ability of LLMs to prioritize candidate genes with minimal human intervention. This suggests the potential of this technology to boost productivity, especially for tasks that require leveraging extensive biomedical knowledge.
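Steps (3) and (4) of the workflow described in the abstract above (scoring each candidate gene on six criteria, then prioritizing) can be illustrated with a minimal sketch. It covers only the aggregation and ranking, not the LLM prompting that produces the scores. The criterion names, the gene names, the equal weighting, and all numbers below are placeholders assumed for illustration; none of them come from the paper.

```python
from typing import Dict, List, Tuple

# Placeholder names for the six scoring criteria (hypothetical: the text above
# says there are six criteria but does not list them).
CRITERIA = tuple(f"criterion_{i}" for i in range(1, 7))


def rank_candidates(
    scores: Dict[str, Dict[str, float]], top_n: int = 5
) -> List[Tuple[str, float]]:
    """Sum each gene's per-criterion scores and return the top_n genes.

    Equal weighting of criteria is an assumption of this sketch.
    Ties are broken alphabetically by gene name so the result is deterministic.
    """
    totals: Dict[str, float] = {}
    for gene, per_criterion in scores.items():
        missing = [c for c in CRITERIA if c not in per_criterion]
        if missing:
            raise ValueError(f"{gene} is missing scores for: {missing}")
        totals[gene] = sum(per_criterion[c] for c in CRITERIA)
    ranked = sorted(totals.items(), key=lambda item: (-item[1], item[0]))
    return ranked[:top_n]


# Made-up scores for placeholder genes (not data from the paper).
example_scores = {
    "GENE_A": dict(zip(CRITERIA, (8, 7, 9, 6, 8, 7))),
    "GENE_B": dict(zip(CRITERIA, (5, 6, 4, 7, 5, 6))),
    "GENE_C": dict(zip(CRITERIA, (9, 9, 8, 8, 9, 9))),
}

for gene, total in rank_candidates(example_scores, top_n=2):
    print(gene, total)
```

In the workflow as summarized, the ranked list would still go through fact-checking and a final selection step, so a ranking like this is an intermediate result rather than a conclusion.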
5.
翁凯
(2025-01-01 01:23):
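#paper DOI: 10.1038/s41467-024-52615-9. Nature Communications, 2024. The estrogen response in fibroblasts promotes ovarian metastases of gastric cancer. This paper explores why ovarian metastasis of gastric cancer in women is more frequent before menopause. Relying mainly on single-cell transcriptomics, the authors find that premenopausal women have higher levels of estrogen and estrogen receptor (ER), and that estrogen plays a key role in promoting ovarian metastasis. Specifically, ovarian fibroblasts express high levels of ER and, on estrogen stimulation, secrete Midkine (MDK). MDK binds low-density lipoprotein receptor-related protein 1 (LRP1) and promotes the migration and invasion of gastric cancer cells, which enhances their ability to metastasize to the ovary.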
6.
翁凯
(2024-05-31 22:29):
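#paper doi: 10.1038/s41587-021-01033-z. Differential abundance testing on single-cell data using k-nearest neighbor graphs. This study steps outside the cell-clustering framework: it starts from each cell's neighbors and compares differences in cell proportions between groups.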
IF: 33.100, Q1
Nature biotechnology,
2022-02.
DOI: 10.1038/s41587-021-01033-z
PMID: 34594043
PMCID: PMC7617075
Abstract:
Current computational workflows for comparative analyses of single-cell datasets typically use discrete clusters as input when testing for differential abundance among experimental conditions. However, clusters do not always provide the appropriate resolution and cannot capture continuous trajectories. Here we present Milo, a scalable statistical framework that performs differential abundance testing by assigning cells to partially overlapping neighborhoods on a k-nearest neighbor graph. Using simulations and single-cell RNA sequencing (scRNA-seq) data, we show that Milo can identify perturbations that are obscured by discretizing cells into clusters, that it maintains false discovery rate control across batch effects and that it outperforms alternative differential abundance testing strategies. Milo identifies the decline of a fate-biased epithelial precursor in the aging mouse thymus and identifies perturbations to multiple lineages in human cirrhotic liver. As Milo is based on a cell-cell similarity structure, it might also be applicable to single-cell data other than scRNA-seq. Milo is provided as an open-source R software package at https://github.com/MarioniLab/miloR .
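The neighborhood idea in this entry can be illustrated with a small sketch: for every cell, take the cell plus its k nearest neighbors and count how many members come from each experimental condition. This is not Milo's implementation. It omits the statistical testing and false discovery rate control that the abstract describes, uses brute-force Euclidean distances, and all function names and toy coordinates are assumptions made for illustration.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float]


def knn_neighborhood(points: Sequence[Point], index: int, k: int) -> List[int]:
    """Return the index of a cell plus the indices of its k nearest neighbours."""
    distances = sorted(
        (math.dist(points[index], other), j)
        for j, other in enumerate(points)
        if j != index
    )
    return [index] + [j for _, j in distances[:k]]


def neighborhood_counts(
    points: Sequence[Point], labels: Sequence[str], k: int
) -> List[Dict[str, int]]:
    """For each cell's neighbourhood, count how many members carry each label."""
    all_counts = []
    for i in range(len(points)):
        counts: Dict[str, int] = {}
        for j in knn_neighborhood(points, i, k):
            counts[labels[j]] = counts.get(labels[j], 0) + 1
        all_counts.append(counts)
    return all_counts


def log2_ratio(counts: Dict[str, int], a: str, b: str, pseudocount: float = 0.5) -> float:
    """Log2 ratio of label a to label b within one neighbourhood.

    The pseudocount avoids division by zero; its value is an arbitrary choice.
    """
    return math.log2((counts.get(a, 0) + pseudocount) / (counts.get(b, 0) + pseudocount))


# Toy data: two groups of cells in a 2D embedding with made-up condition labels.
toy_points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10), (10, 11), (11, 10), (11, 11)]
toy_labels = ["ctrl", "ctrl", "ctrl", "treated", "treated", "treated", "treated", "ctrl"]

for i, counts in enumerate(neighborhood_counts(toy_points, toy_labels, k=3)):
    print(i, counts, round(log2_ratio(counts, "treated", "ctrl"), 2))
```

Because neighborhoods overlap, one cell contributes to several counts, which is why a real analysis needs a multiple-testing correction that accounts for that overlap rather than treating neighborhoods as independent.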
7.
翁凯
(2024-04-30 22:44):
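#paper doi: 10.1101/2024.03.18.585576, bioRxiv, 2024-03-19. Single-cell genomics and regulatory networks for 388 human brains. For the first time at population scale, this study performs single-nucleus transcriptome and chromatin accessibility sequencing on the human prefrontal cortex. At cell-type resolution, it then examines gene regulatory networks, cell-to-cell communication networks, and related aspects under physiological and pathological conditions. The results are available on the project (brainSCOPE) website: http://brainscope.psychencode.org. The study used brains from 388 individuals: 333 were generated by this study and 55 came from external sources. 182 were healthy individuals; the rest had schizophrenia, bipolar disorder (manic-depressive illness), autism, or Alzheimer's disease. All 388 individuals have snRNA-seq data. 59 individuals have snATAC-seq data, 40 of them as snMultiome (transcriptome and ATAC measured in the same cell). After quality control there were 2.8 million nuclei in total, annotated to 28 cell types.
[Research angles and selected main findings] 1. Identify cis-eQTLs and cis-regulatory elements for each cell type. 2. Build cell-type-specific gene regulatory networks and cell-to-cell communication networks, and show how these networks vary in aging and neuropsychiatric disorders. 3. Examine how each cell type's proportion, gene expression, and epigenetics relate to age and Alzheimer's disease. Build models that predict age from gene expression; the transcriptomes of 6 cell types turn out to have strong predictive power. 4. Build a model within each cell type that uses genetic variants to predict effects on gene expression in cells and tissues. Simulate how perturbations of gene sequence affect downstream gene expression and phenotypes (including disease propensity).
[Limitations and future directions] 1. RNA expression cannot stand in for protein expression, which is especially pronounced in some brain regions. 2. Postmortem brain tissue differs from living brain tissue. 3. Study more brain regions, as well as brain regions during development and aging, or organoids. 4. Integrate more data types, such as imaging data, to improve phenotype prediction.
[Application prospects] 1. Provides a new perspective on the molecular mechanisms of neuropsychiatric disorders and may help identify new treatments. 2. The integrative model (LNCTP) can predict an individual's cell-type-specific functional gene expression from genotype data, offering a tool for precision medicine. 3. The results can be used to prioritize potential drug targets and to simulate expression changes of specific genes in order to predict their potential effect on disease phenotypes. 4. The brainSCOPE resource created by this study is available to other researchers for further exploration of the brain's molecular structure and function. Overall, through large-scale single-cell analysis, this study provides a valuable resource and new insights for understanding the complexity of the human brain, disease mechanisms, and potential therapeutic interventions.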
bioRxiv : the preprint server for biology,
2024-Mar-30.
DOI: 10.1101/2024.03.18.585576
PMID: 38562822
Abstract:
Single-cell genomics is a powerful tool for studying heterogeneous tissues such as the brain. Yet, little is understood about how genetic variants influence cell-level gene expression. Addressing this, we uniformly processed single-nuclei, multi-omics datasets into a resource comprising >2.8M nuclei from the prefrontal cortex across 388 individuals. For 28 cell types, we assessed population-level variation in expression and chromatin across gene families and drug targets. We identified >550K cell-type-specific regulatory elements and >1.4M single-cell expression-quantitative-trait loci, which we used to build cell-type regulatory and cell-to-cell communication networks. These networks manifest cellular changes in aging and neuropsychiatric disorders. We further constructed an integrative model accurately imputing single-cell expression and simulating perturbations; the model prioritized ~250 disease-risk genes and drug targets with associated cell types.
8.
翁凯
(2022-07-31 08:58):
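#paper doi: 10.1016/j.ccell.2015.09.018 Cancer Cell, 2015, RNA-Seq of Tumor-Educated Platelets Enables Blood-Based Pan-Cancer, Multiclass, and Molecular Pathway Cancer Diagnostics. The study finds that mRNA carried by platelets can predict whether a person has cancer (96% accuracy) and can further predict the location of the primary tumor (71% accuracy across six tumor types).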
IF: 48.800, Q1
Cancer cell,
2015-Nov-09.
DOI: 10.1016/j.ccell.2015.09.018
PMID: 26525104
PMCID: PMC4644263
Abstract:
Tumor-educated blood platelets (TEPs) are implicated as central players in the systemic and local responses to tumor growth, thereby altering their RNA profile. We determined the diagnostic potential of TEPs by mRNA sequencing of 283 platelet samples. We distinguished 228 patients with localized and metastasized tumors from 55 healthy individuals with 96% accuracy. Across six different tumor types, the location of the primary tumor was correctly identified with 71% accuracy. Also, MET or HER2-positive, and mutant KRAS, EGFR, or PIK3CA tumors were accurately distinguished using surrogate TEP mRNA profiles. Our results indicate that blood platelets provide a valuable platform for pan-cancer, multiclass cancer, and companion diagnostics, possibly enabling clinical advances in blood-based "liquid biopsies".
9.
翁凯
(2022-06-30 22:15):
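#paper 10.1038/s41588-018-0129-5. Nature Genetics. 2018. Genetic identification of brain cell types underlying schizophrenia. This appears to be among the first studies to use single-cell transcriptomes to locate which cell types are enriched for the candidate causal genes accumulated by genetic studies. That helps enable finer-grained mechanistic research.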
IF: 31.700, Q1
Nature genetics,
2018-06.
DOI: 10.1038/s41588-018-0129-5
PMID: 29785013
PMCID: PMC6477180
Abstract:
With few exceptions, the marked advances in knowledge about the genetic basis of schizophrenia have not converged on findings that can be confidently used for precise experimental modeling. By applying knowledge of the cellular taxonomy of the brain from single-cell RNA sequencing, we evaluated whether the genomic loci implicated in schizophrenia map onto specific brain cell types. We found that the common-variant genomic results consistently mapped to pyramidal cells, medium spiny neurons (MSNs) and certain interneurons, but far less consistently to embryonic, progenitor or glial cells. These enrichments were due to sets of genes that were specifically expressed in each of these cell types. We also found that many of the diverse gene sets previously associated with schizophrenia (genes involved in synaptic function, those encoding mRNAs that interact with FMRP, antipsychotic targets, etc.) generally implicated the same brain cell types. Our results suggest a parsimonious explanation: the common-variant genetic results for schizophrenia point at a limited set of neurons, and the gene sets point to the same cells. The genetic risk associated with MSNs did not overlap with that of glutamatergic pyramidal cells and interneurons, suggesting that different cell types have biologically distinct roles in schizophrenia.
10.
翁凯
(2022-05-23 19:00):
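#paper 10.1080/01621459.2020.1721245. Journal of the American Statistical Association. 2020. The Book of Why: The New Science of Cause and Effect. This is a review of Judea Pearl's The Book of Why. Judging from the review, the book has considerable limitations. For example, Pearl addresses only causal analysis and neglects the determination of causal structure, yet the causal structure itself is often unclear or uncertain. As I read it, the review also points out that Pearl considers randomized experiments unimportant, as long as there appears to be no interference. The review, however, argues that the value of a randomized experiment is that it lets researchers check the experimental design and process; moreover, experiments allow us not to know the causal structure. The review further argues that Pearl's causal models are not expressive enough; that although Pearl surveys the history of causal research, his account is incomplete and ignores other lines of causal research; that Pearl attributes researchers' rejection of his theory to "cultural resistance" when the real reason is that the theory is of limited use; and that Pearl's theory is closely related to Robin's earlier causal theory, merely adding some independence assumptions, which Robin left out not because he could not think of them but because he found them too far-fetched and impossible to verify experimentally. After reading this review, I probably will not make The Book of Why a priority.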
IF: 3.000, Q1
Journal of the American Statistical Association,
2020.
DOI: 10.1080/01621459.2020.1721245
Abstract:
No abstract available.
11.
翁凯
(2022-04-30 23:23):
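#paper DOI: 10.1126/science.1192788 Science, 2011, How to Grow a Mind: Statistics, Structure, and Abstraction. This is a review that offers what seems to me a fairly credible account of how the human brain learns. One hallmark of human learning is that it works well from only a few examples (that is, from very sparse data), especially when learning causal relations. The authors argue that this efficiency comes from using abstract knowledge to guide learning, and that Bayes' theorem explains well how abstract knowledge does so. Bayesian methods can also make effective use of many forms of abstract knowledge, which avoids the need of traditional methods to enumerate every possibility (each a long numeric vector). As for how abstract knowledge is itself learned from data, for instance how one knows which form is the right one, the authors note that the various forms (trees, spaces, rings, orders, ...) can all be represented as graphs, that hierarchical Bayesian models can generate the required graph, and that nonparametric hierarchical Bayesian models automatically embody Occam's razor, introducing more variables only when the data require them. Still, some important questions remain unsolved by hierarchical Bayesian models, such as how learning gets started at all: there has to be some foundation. The authors point out that some Bayesian modelers believe even the most abstract concepts (such as the concept of causality) can in principle be learned. There is further discussion, for example of Turing-complete compositional representations and of how the brain actually implements Bayesian algorithms, but that is not my current interest (or rather, I have no time tonight to reread it carefully, even though I read this paper when it came out in 2011). Interested readers can go straight to the paper.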
Abstract:
In coming to understand the world, in learning concepts, acquiring language, and grasping causal relations, our minds make inferences that appear to go far beyond the data available. How do we do it? This review describes recent approaches to reverse-engineering human learning and cognitive development and, in parallel, engineering more humanlike machine learning systems. Computational models that perform probabilistic inference over hierarchies of flexibly structured representations can address some of the deepest questions about the nature and origins of human thought: How does abstract knowledge guide learning and reasoning from sparse data? What forms does our knowledge take, across different domains and tasks? And how is that abstract knowledge itself acquired?
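The claim in this entry that Bayes' theorem lets abstract knowledge guide learning from very few examples can be made concrete with a toy calculation. The sketch below is an illustration chosen for this note, not a model taken from the paper: it assumes a learner with two candidate concepts over the numbers 1 to 100, a uniform prior, and examples drawn uniformly from the true concept, so that a concept covering fewer numbers assigns each consistent example a higher likelihood. The concept names and the example numbers are hypothetical.

```python
from fractions import Fraction
from typing import Dict, FrozenSet, Iterable


def posterior(
    hypotheses: Dict[str, FrozenSet[int]],
    prior: Dict[str, Fraction],
    examples: Iterable[int],
) -> Dict[str, Fraction]:
    """Apply Bayes' rule over a finite set of candidate concepts.

    Each example is assumed to be drawn uniformly from the true concept, so a
    concept with extension size s gives n consistent examples likelihood
    (1/s)**n, and likelihood 0 if any example falls outside the concept.
    """
    observed = list(examples)
    unnormalized: Dict[str, Fraction] = {}
    for name, extension in hypotheses.items():
        if all(x in extension for x in observed):
            likelihood = Fraction(1, len(extension)) ** len(observed)
        else:
            likelihood = Fraction(0)
        unnormalized[name] = prior[name] * likelihood
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("no hypothesis is consistent with the examples")
    return {name: value / total for name, value in unnormalized.items()}


# Two hypothetical concepts over the numbers 1..100.
concepts = {
    "even": frozenset(range(2, 101, 2)),                   # 50 members
    "powers_of_two": frozenset(2 ** i for i in range(7)),  # 1, 2, 4, ..., 64
}
uniform_prior = {name: Fraction(1, 2) for name in concepts}

after_one = posterior(concepts, uniform_prior, [16])
after_three = posterior(concepts, uniform_prior, [16, 8, 2])
print(float(after_one["powers_of_two"]), float(after_three["powers_of_two"]))
```

Both concepts are consistent with every example, yet the narrower one gains belief with each additional observation. That preference for the simplest hypothesis that still fits is a small-scale analogue of the Occam's razor behavior mentioned above, under the sampling assumption stated in the docstring.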