孤舟蓑笠翁 (2025-09-11 16:58):
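#paper [DOI] 10.1126/science.adm7066; [Year] 2025; [Journal] Science; [Title] Machine learning–based penetrance of genetic variants. [Summary] The authors tackle the puzzle of why the same genetic variant causes disease in some carriers but not in others. Conventional penetrance estimates rely on a binary "affected/unaffected" label and are often distorted by small samples and strong bias. The authors therefore took 1.34 million patients from routine check-ups at Mount Sinai in New York and fed 47 laboratory measures (blood tests, blood pressure, and so on), plus age, sex, and BMI, into extreme gradient boosting tree models. For each of 10 autosomal dominant diseases (for example familial hypercholesterolemia, hypertrophic cardiomyopathy, and polycystic kidney disease) they trained a "disease score": a value between 0 and 1, where values near 1 mean the model is confident the person is currently in a disease state and values near 0 mean the person looks healthy. They then applied the models to the independent BioMe cohort of about 29,000 participants with exome sequencing and computed an "ML penetrance" for 1648 rare variants in 31 genes (143 classified pathogenic, 96 benign, 1181 variants of uncertain significance (VUS), and 228 loss-of-function (LoF)). ML penetrance is the carriers' mean disease score, interpreted as a probability of disease. The median ML penetrance of pathogenic variants was 0.52, well above the 0.28 of benign variants; VUS fell in between at about 0.46, and LoF variants were similar to pathogenic ones. The higher the ML penetrance, the earlier carriers showed abnormalities on blood tests or imaging: carriers of high-penetrance PKD variants had a kidney filtration rate 40 mL/min lower on average, and carriers of high-penetrance FH variants had LDL 119 mg/dL higher. These results agreed with in vitro assays showing reduced BRCA1 repair capacity and reduced LDL uptake by LDLR. Conventional "affected/unaffected" penetrance can take only a few discrete values such as 0, 0.5, and 1, whereas ML penetrance gives a continuous value between 0 and 1. It also brings back into the evaluation the 20% of variants that the conventional method discards because a diagnosis cannot be established, and it singles out 66 high-penetrance VUS and 48 high-penetrance LoF variants whose carriers did develop the corresponding organ damage over years of follow-up. The authors replicated the trend in the UK Biobank, which suggests the method is transferable. The whole pipeline uses only routine hospital lab results at no extra cost, gives genetic counseling a quantitative, individualized risk number, and can help laboratories prioritize the VUS most likely to be truly pathogenic for validation.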
Science, 2025-8-28. DOI: 10.1126/science.adm7066
Machine learning–based penetrance of genetic variants
Abstract:
Accurate variant penetrance estimation is crucial for precision medicine. We constructed machine learning (ML) models for 10 diseases using 1,347,298 participants with electronic health records, then applied them to an independent cohort with linked exome data. Resulting probabilities were used to evaluate ML penetrance of 1648 rare variants in 31 autosomal dominant disease-predisposition genes. ML penetrance was variable across variant classes, but highest for pathogenic and loss-of-function variants, and was associated with clinical outcomes and functional data. Compared with conventional case-versus-control approaches, ML penetrance provided refined quantitative estimates and aided the interpretation of variants of uncertain significance and loss-of-function variants by delineating clinical trajectories over time. By leveraging ML and deep phenotyping, we present a scalable approach to accurately quantify disease risk of variants.
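The contrast drawn above between ML penetrance and conventional case-versus-control penetrance can be sketched in a few lines of Python. This is only a minimal illustration, assuming ML penetrance is the plain mean of the carriers' model scores as the summary describes. The function names `ml_penetrance` and `conventional_penetrance`, the data layout, and the toy values are all hypothetical and are not taken from the paper's code.

```python
from statistics import mean


def ml_penetrance(disease_scores, carriers):
    """Average the model's disease score over the carriers of each variant."""
    estimates = {}
    for variant, people in carriers.items():
        scores = [disease_scores[p] for p in people if p in disease_scores]
        if scores:  # skip variants with no scored carrier
            estimates[variant] = mean(scores)
    return estimates


def conventional_penetrance(diagnoses, carriers):
    """Fraction of carriers labelled affected; None means no definitive label."""
    estimates = {}
    for variant, people in carriers.items():
        labels = [diagnoses[p] for p in people if diagnoses.get(p) is not None]
        if labels:  # a variant with no definitive label cannot be estimated
            estimates[variant] = sum(labels) / len(labels)
    return estimates


# Toy data, invented for illustration only (not from the study).
disease_scores = {"p1": 0.9, "p2": 0.4, "p3": 0.2}   # model output in [0, 1]
diagnoses = {"p1": True, "p2": False, "p3": None}     # None = cannot be diagnosed
carriers = {"varA": ["p1", "p2"], "varB": ["p3"]}

ml = ml_penetrance(disease_scores, carriers)
conventional = conventional_penetrance(diagnoses, carriers)
print(ml)
print(conventional)
```

In this toy case the conventional estimate for `varA` can only land on 0, 0.5, or 1 because there are two labelled carriers, and `varB` drops out entirely because its single carrier has no definitive diagnosis. The score-based estimate still returns a value for both variants.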