23 paper shares found in total; this page shows entries 21 - 23.
21.
颜林林
(2025-02-24 21:06):
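#paper doi:10.1038/s41588-024-02050-9, Nature Genetics, 2025, Cell state-dependent allelic effects and contextual Mendelian randomization analysis for human brain phenotypes. This paper was published in Nature Genetics in January of this year. The authors ran snRNA-seq (single-nucleus RNA sequencing) and SNP array genotyping on 391 postmortem human brains (208 disease cases vs. 183 controls). Single-nucleus sequencing gives the expression of every gene in each cell type, which makes it possible to identify cell-type-specific eQTLs, i.e. variants that affect gene expression only in a particular cell type. This study logic (identifying cell-type-specific eQTLs) has already been used in more than one earlier paper. The key innovation here is a set of three models (M0, M1, M2) that predict expression from, respectively, clinical covariates, covariates + genotype, and covariates + genotype × disease. Likelihood ratio tests of M1 against M0 and of M2 against M1 then pick out the variants that affect gene expression without directly affecting the disease phenotype. These are exactly what the subsequent Mendelian randomization analysis needs, so that a causal relationship (not merely a correlation) can be established between a gene (its expression) and a phenotype. The paper then uses large-scale proteomic data to validate the findings at the protein level.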
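The nested-model comparison described above can be sketched roughly as follows. This is a minimal illustration on simulated data, not the paper's pipeline: it assumes ordinary Gaussian linear models, assumes that disease status is one of the clinical covariates, and uses hypothetical names (`ols_rss`, `chi2_sf_df1`, `lrt`) and made-up effect sizes. Each pair of models differs by one term, so each test has one degree of freedom.

```python
import math
import random


def ols_rss(X, y):
    """Least-squares fit of y on the columns of X; returns the residual sum of squares."""
    n, p = len(X), len(X[0])
    # Augmented normal equations [X^T X | X^T y].
    M = [[sum(X[i][a] * X[i][b] for i in range(n)) for b in range(p)]
         + [sum(X[i][a] * y[i] for i in range(n))] for a in range(p)]
    # Gaussian elimination with partial pivoting.
    for col in range(p):
        piv = max(range(col, p), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, p):
            f = M[r][col] / M[col][col]
            for c in range(col, p + 1):
                M[r][c] -= f * M[col][c]
    # Back substitution.
    beta = [0.0] * p
    for r in reversed(range(p)):
        beta[r] = (M[r][p] - sum(M[r][c] * beta[c] for c in range(r + 1, p))) / M[r][r]
    return sum((y[i] - sum(X[i][j] * beta[j] for j in range(p))) ** 2 for i in range(n))


def chi2_sf_df1(stat):
    """Upper-tail probability of a chi-square variable with 1 degree of freedom."""
    return math.erfc(math.sqrt(max(stat, 0.0) / 2.0))


def lrt(rss_small, rss_big, n):
    """Likelihood ratio test between nested Gaussian linear models differing by one term."""
    stat = n * math.log(rss_small / rss_big)
    return stat, chi2_sf_df1(stat)


# Simulated toy data (hypothetical values, not taken from the paper).
rng = random.Random(0)
n = 600
rows = []
for _ in range(n):
    age_z = rng.gauss(0.0, 1.0)                          # standardised age
    disease = rng.randint(0, 1)                          # 0 = control, 1 = case
    geno = (rng.random() < 0.3) + (rng.random() < 0.3)   # allele dosage 0/1/2
    expr = 0.2 * age_z + 0.5 * geno + 1.0 * geno * disease + rng.gauss(0.0, 1.0)
    rows.append((age_z, disease, geno, expr))

y = [r[3] for r in rows]
X0 = [[1.0, a, d] for a, d, g, _ in rows]                # M0: covariates only
X1 = [[1.0, a, d, g] for a, d, g, _ in rows]             # M1: covariates + genotype
X2 = [[1.0, a, d, g, g * d] for a, d, g, _ in rows]      # M2: + genotype x disease

rss0, rss1, rss2 = ols_rss(X0, y), ols_rss(X1, y), ols_rss(X2, y)
stat_m1, p_m1 = lrt(rss0, rss1, n)   # M1 vs M0: is there any genotype effect on expression?
stat_m2, p_m2 = lrt(rss1, rss2, n)   # M2 vs M1: does that effect depend on disease status?
print(f"M1 vs M0: LR = {stat_m1:.1f}, p = {p_m1:.2e}")
print(f"M2 vs M1: LR = {stat_m2:.1f}, p = {p_m2:.2e}")
```

In this toy gene both tests come out significant because a genotype × disease term was deliberately simulated. A variant with a small p-value for M1 vs M0 but a large one for M2 vs M1 would be the disease-independent kind of eQTL that the post describes as suitable for Mendelian randomization.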
Nature Genetics,
2025-2.
DOI: 10.1038/s41588-024-02050-9
Abstract:
Gene expression quantitative trait loci are widely used to infer relationships between genes and central nervous system (CNS) phenotypes; however, the effect of brain disease on these inferences is unclear. Using 2,348,438 single-nuclei profiles from 391 disease-case and control brains, we report 13,939 genes whose expression correlated with genetic variation, of which 16.7–40.8% (depending on cell type) showed disease-dependent allelic effects. Across 501 colocalizations for 30 CNS traits, 23.6% had a disease dependency, even after adjusting for disease status. To estimate the unconfounded effect of genes on outcomes, we repeated the analysis using nondiseased brains (n = 183) and reported an additional 91 colocalizations not present in the larger mixed disease and control dataset, demonstrating enhanced interpretation of disease-associated variants. Principled implementation of single-cell Mendelian randomization in control-only brains identified 140 putatively causal gene–trait associations, of which 11 were replicated in the UK Biobank, prioritizing candidate peripheral biomarkers predictive of CNS outcomes.
22.
惊鸿
(2025-02-15 00:02):
#paper DeepSeek-V2: A Strong, Economical, and Efficient Mixture-of-Experts Language Model
Pub Date : 2024-05-07
DOI : arxiv-2405.04434
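We present DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference. It has 236B parameters in total, of which 21B are activated for each token, and it supports a context length of 128K tokens. DeepSeek-V2 adopts innovative architectures including Multi-head Latent Attention (MLA) and DeepSeekMoE. MLA guarantees efficient inference by significantly compressing the Key-Value (KV) cache into a latent vector, while DeepSeekMoE makes it possible to train strong models at an economical cost through sparse computation. Compared with DeepSeek 67B, DeepSeek-V2 performs significantly better while saving 42.5% of training costs, reducing the KV cache by 93.3%, and raising the maximum generation throughput to 5.76 times. We pretrain DeepSeek-V2 on a high-quality, multi-source corpus of 8.1T tokens, and then apply Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) to fully unlock its potential. Evaluation results show that, even with only 21B activated parameters, DeepSeek-V2 and its chat versions still achieve top-tier performance among open-source models. The model checkpoints are available at "https://github.com/deepseek-ai/DeepSeek-V2".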
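To make the "sparse computation" idea concrete, here is a minimal sketch of generic top-k expert routing in plain Python. It is not the DeepSeekMoE architecture or its code: the router weights, the toy experts, the function names (`softmax`, `moe_forward`) and the choice to renormalise gates over the selected experts are all assumptions made for illustration. The only figures taken from the post are the 236B total and 21B activated parameters.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def moe_forward(token, router_weights, experts, k):
    """Send one token vector through only the k highest-scoring experts."""
    # Router logit per expert: dot product of the token with that expert's router row.
    logits = [sum(w * x for w, x in zip(row, token)) for row in router_weights]
    probs = softmax(logits)
    top = sorted(range(len(experts)), key=lambda i: probs[i], reverse=True)[:k]
    norm = sum(probs[i] for i in top)
    out = [0.0] * len(token)
    for i in top:
        gate = probs[i] / norm            # assumption: gates renormalised over the top-k
        y = experts[i](token)             # only the selected experts are evaluated
        out = [o + gate * v for o, v in zip(out, y)]
    return out, top


# Four toy "experts" that just scale the input (hypothetical stand-ins for FFN blocks).
experts = [lambda v, s=s: [s * x for x in v] for s in (1.0, 2.0, 3.0, 4.0)]
router_weights = [[0.0, 0.0], [1.0, 0.0], [3.0, 0.0], [2.0, 0.0]]
token = [1.0, 0.0]

out, chosen = moe_forward(token, router_weights, experts, k=2)
print("experts used:", chosen, "output:", out)

# Share of parameters active per token, using the figures quoted in the post.
active_fraction = 21 / 236
print(f"activated share of parameters: {active_fraction:.1%}")
```

With these toy numbers only two of the four experts run for the token, which is the sense in which the per-token cost follows the activated parameters (about 8.9% of the total here) and not the full parameter count.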
arXiv,
2024-05-07T15:56:43Z.
DOI: 10.48550/arXiv.2405.04434
Abstract:
We present DeepSeek-V2, a strong Mixture-of-Experts (MoE) language model characterized by economical training and efficient inference. It comprises 236B total parameters, of which 21B are activated for each token, and supports a context length of 128K tokens. DeepSeek-V2 adopts innovative architectures including Multi-head Latent Attention (MLA) and DeepSeekMoE. MLA guarantees efficient inference through significantly compressing the Key-Value (KV) cache into a latent vector, while DeepSeekMoE enables training strong models at an economical cost through sparse computation. Compared with DeepSeek 67B, DeepSeek-V2 achieves significantly stronger performance, and meanwhile saves 42.5% of training costs, reduces the KV cache by 93.3%, and boosts the maximum generation throughput to 5.76 times. We pretrain DeepSeek-V2 on a high-quality and multi-source corpus consisting of 8.1T tokens, and further perform Supervised Fine-Tuning (SFT) and Reinforcement Learning (RL) to fully unlock its potential. Evaluation results show that, even with only 21B activated parameters, DeepSeek-V2 and its chat versions still achieve top-tier performance among open-source models.
23.
龙海晨
(2025-02-14 17:36):
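#paper Ayub Q, Mezzavilla M, Pagani L, Haber M, Mohyuddin A, Khaliq S, Mehdi SQ, Tyler-Smith C. The Kalash genetic isolate: ancient divergence, drift, and selection. Am J Hum Genet. 2015 May 7;96(5):775-83. doi: 10.1016/j.ajhg.2015.03.012. Epub 2015 Apr 30. PMID: 25937445; PMCID: PMC4570283. This paper is about genetics and population origins: it traces the origin of a population by studying the genomes of the Kalash. The Kalash are an enigmatic, isolated population of Indo-European speakers who have lived for centuries in the Hindu Kush mountain range of present-day Pakistan. Earlier Y-chromosome and mitochondrial DNA markers provided no supporting evidence for their claimed Greek descent following the invasion of the region by Alexander III of Macedon. To investigate the origin of this population, the authors compared it with published data from ancient hunter-gatherers and European farmers. The comparison shows that the Kalash share genetic drift with Paleolithic Siberian hunter-gatherers and may represent an extremely drifted ancient northern Eurasian population. Since splitting from other South Asian populations, the Kalash have maintained a low long-term effective population size, and no gene flow was detected from their geographic neighbors in Pakistan or from other extant Eurasian populations. The mean divergence time between the Kalash and the other populations currently living in the region is estimated at 11,800 years ago (95% confidence interval = 10,600−12,600 years ago). The genetic analysis indicates that they are descendants of some of the earliest migrants into the Indian subcontinent from West Asia.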
Abstract:
No abstract available.