Literature Collection and Sharing Platform

白鸟 (2026-01-29 16:02):

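#paper DOI: 10.1126/science.ads9530
Title: Deep contrastive learning enables genome-wide virtual screening
Journal: Science, 2026

Summary: DrugCLIP is a framework based on deep contrastive learning for ultra-large-scale, ultrafast genome-wide virtual screening.

Core problem: Conventional molecular docking is too computationally expensive to handle the combination of the whole human genome (roughly 10,000 or more protein targets) with a massive compound library (e.g., 500 million molecules), which amounts to trillions of protein-molecule pairs.

Approach: Contrastive learning embeds protein pockets (binding sites) and small molecules into a shared latent space, where similarity directly encodes how likely a protein and a molecule are to bind. This enables a first trillion-scale genome-wide screen and represents a new paradigm for the post-AlphaFold era, moving the field from one-to-one "target vs. compound" screening toward systematic exploration of "whole genome × whole chemical space".

Highlights:
Speed: DrugCLIP can score trillions of protein-molecule pairs within one day, which makes true genome-scale screening practical.
Accuracy: It performs well on multiple benchmarks and leads on metrics such as EF1% (enrichment factor in the top 1%); it also supports multi-target screening and generalizes to new targets.
Interpretability: Visualizing the embedding space (t-SNE and similar methods) shows protein-molecule matching patterns directly.
Openness: The authors released a large-scale screening database that researchers can query or download directly; the code of an early version is open source (the NeurIPS 2023 DrugCLIP repository).

Deployment and use:
1. Online version: submit an input file to generate results.
2. GitHub open-source version: the early version is open source and can be called from Python.

Limitations: dependence on protein structures, compute requirements, and the need for experimental validation.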
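To make the "similarity in a shared latent space" idea concrete, here is a minimal sketch of how screening reduces to a matrix product once embeddings exist, plus the usual definition of an enrichment factor such as EF1%. Everything in it is an assumption for illustration: the function names (`l2_normalize`, `screen_top_k`, `enrichment_factor`) are hypothetical, cosine similarity is assumed as the scoring function, and the toy vectors stand in for encoder outputs. It is not the DrugCLIP code or API.

```python
from __future__ import annotations

import numpy as np


def l2_normalize(x: np.ndarray) -> np.ndarray:
    """Scale each row to unit length so a dot product equals cosine similarity."""
    norms = np.linalg.norm(x, axis=1, keepdims=True)
    return x / np.clip(norms, 1e-12, None)


def screen_top_k(
    pocket_emb: np.ndarray, mol_emb: np.ndarray, k: int
) -> tuple[np.ndarray, np.ndarray]:
    """For each pocket, return indices and scores of its k most similar molecules."""
    # One matrix product scores every pocket against every molecule.
    scores = l2_normalize(pocket_emb) @ l2_normalize(mol_emb).T
    top_idx = np.argsort(-scores, axis=1)[:, :k]
    top_scores = np.take_along_axis(scores, top_idx, axis=1)
    return top_idx, top_scores


def enrichment_factor(
    scores: np.ndarray, is_active: np.ndarray, fraction: float = 0.01
) -> float:
    """Active rate in the top-scored fraction divided by the overall active rate."""
    n_top = max(1, int(round(len(scores) * fraction)))
    top_idx = np.argsort(-scores)[:n_top]
    return float(is_active[top_idx].mean() / is_active.mean())


# Toy stand-ins for encoder outputs (hypothetical 2-D embeddings).
toy_pockets = np.array([[1.0, 0.0]])
toy_molecules = np.array([[2.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
toy_idx, toy_scores = screen_top_k(toy_pockets, toy_molecules, k=2)

# Scale quoted in the post: ~10,000 targets x 500 million molecules, in one day.
n_pairs = 10_000 * 500_000_000
pairs_per_second = n_pairs / (24 * 60 * 60)
print(f"{n_pairs:.1e} pairs, about {pairs_per_second:.1e} pairs per second")
```

Because molecule embeddings do not depend on the target, they can be computed once and reused for every pocket, which is the general reason embedding-based retrieval scales so much better than per-pair docking.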

Science, 2026-1-8. DOI: 10.1126/science.ads9530

Deep contrastive learning enables genome-wide virtual screening

Yinjun Jia, Bowen Gao, Jiaxin Tan, Jiqing Zheng, Xin Hong, Wenyu Zhu, Haichuan Tan, Yuan Xiao, Liping Tan, Hongyi Cai, ...

Abstract:

Recent breakthroughs in protein structure prediction have opened new avenues for genome-wide drug discovery, yet existing virtual screening methods remain computationally prohibitive. We present DrugCLIP, a contrastive learning framework that achieves ultrafast and accurate virtual screening, up to 10 million times faster than docking, while consistently outperforming various baselines on in silico benchmarks. In wet-lab validations, DrugCLIP achieved a 15% hit rate for norepinephrine transporter, and structures of two identified inhibitors were determined in complex with the target protein. For thyroid hormone receptor interactor 12, a target that lacks holo structures and small-molecule binders, DrugCLIP achieved a 17.5% hit rate using only AlphaFold2-predicted structures. Finally, we released GenomeScreenDB, an open-access database providing precomputed results for ~10,000 human proteins screened against 500 million compounds, pioneering a drug discovery paradigm in the post-AlphaFold era.

Related Links:

https://doi.org/10.1126/science.ads9530