白鸟 (2026-01-29 16:02):
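#paper DOI: 10.1126/science.ads9530
Title: Deep contrastive learning enables genome-wide virtual screening
Journal: Science, 2026
Summary: DrugCLIP is a deep contrastive learning framework for ultra-large-scale, ultrafast genome-wide virtual screening.
Core problem: Conventional molecular docking is too computationally expensive to cover the whole human genome (roughly 10,000 or more protein targets) crossed with a massive compound library (e.g., 500 million molecules), which amounts to trillions of protein-molecule pairs.
Approach: Contrastive learning embeds protein pockets (binding sites) and small molecules in a shared latent space, where similarity directly encodes how likely a protein and a molecule are to bind. This enables trillion-scale genome-wide screening, presented as a new paradigm for the post-AlphaFold era: a shift from one-to-one "target-compound" screening to systematic exploration of "whole genome × whole chemical space".
Highlights:
Speed: DrugCLIP can score trillions of pairs within one day, which makes genome-wide scale practical.
Accuracy: strong results on multiple benchmarks, leading on metrics such as EF1% (the enrichment factor in the top 1%); supports multi-target screening and generalization.
Interpretability: visualizing the embedding space (t-SNE, etc.) shows protein-molecule matching patterns directly.
Openness: the authors released a large-scale screening database whose results can be queried or downloaded directly; code for an earlier version is open source (the NeurIPS 2023 DrugCLIP repository).
Deployment and use:
1. Online version: submit input files to generate results.
2. GitHub open-source version: the earlier version is open source and can be called from Python.
Limitations: dependence on structures, computational resources, and experimental validation.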
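To make the approach above concrete, here is a minimal sketch of similarity-based screening and of the EF1% metric. It is not the DrugCLIP code: the embeddings are random placeholder vectors rather than outputs of trained pocket and molecule encoders, the toy "binders" are simply placed near the pocket vector by construction, and all function names (`cosine_scores`, `enrichment_factor`, `make_toy_library`) are hypothetical. The enrichment factor uses the standard definition: the active rate in the top-ranked fraction divided by the active rate in the whole library.

```python
import math
import random


def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def cosine_scores(pocket_vec, molecule_vecs):
    """Score every molecule against one pocket using cosine similarity."""
    p = normalize(pocket_vec)
    return [sum(a * b for a, b in zip(p, normalize(m))) for m in molecule_vecs]


def enrichment_factor(scores, is_active, top_fraction=0.01):
    """Active rate in the top-ranked slice divided by the active rate overall."""
    n = len(scores)
    total_active = sum(is_active)
    if n == 0 or total_active == 0:
        raise ValueError("need a non-empty library with at least one active")
    n_top = max(1, int(round(n * top_fraction)))
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    hits_top = sum(is_active[i] for i in ranked[:n_top])
    return (hits_top / n_top) / (total_active / n)


def make_toy_library(pocket_vec, n_molecules=1000, n_active=20, seed=0):
    """Placeholder data (assumption): actives are the pocket vector plus small noise."""
    rng = random.Random(seed)
    vecs, labels = [], []
    for i in range(n_molecules):
        noise = [rng.gauss(0.0, 1.0) for _ in pocket_vec]
        if i < n_active:
            vecs.append([p + 0.1 * e for p, e in zip(pocket_vec, noise)])
            labels.append(1)
        else:
            vecs.append(noise)
            labels.append(0)
    return vecs, labels


if __name__ == "__main__":
    pocket = [1.0] * 16  # stand-in for a learned pocket embedding
    library, labels = make_toy_library(pocket)
    scores = cosine_scores(pocket, library)
    print("EF1% on toy data:", enrichment_factor(scores, labels, 0.01))
```

Because each pair costs only one dot product once the embeddings exist, molecule vectors can be computed once and reused for every pocket, which is what makes this style of screening so much cheaper than docking each pair.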
Deep contrastive learning enables genome-wide virtual screening
Yinjun Jia, Bowen Gao, Jiaxin Tan, Jiqing Zheng, Xin Hong, Wenyu Zhu, Haichuan Tan, Yuan Xiao, Liping Tan, Hongyi Cai, Yanwen Huang, Zhiheng Deng, Xiangwei Wu, Yue Jin, Yafei Yuan, Jiekang Tian, Wei He, Weiying Ma, Yaqin Zhang, Lei Liu, Chuangye Yan, Wei Zhang, Yanyan Lan
Abstract:
Recent breakthroughs in protein structure prediction have opened new avenues for genome-wide drug discovery, yet existing virtual screening methods remain computationally prohibitive. We present DrugCLIP, a contrastive learning framework that achieves ultrafast and accurate virtual screening, up to 10 million times faster than docking, while consistently outperforming various baselines on in silico benchmarks. In wet-lab validations, DrugCLIP achieved a 15% hit rate for norepinephrine transporter, and structures of two identified inhibitors were determined in complex with the target protein. For thyroid hormone receptor interactor 12, a target that lacks holo structures and small-molecule binders, DrugCLIP achieved a 17.5% hit rate using only AlphaFold2-predicted structures. Finally, we released GenomeScreenDB, an open-access database providing precomputed results for ~10,000 human proteins screened against 500 million compounds, pioneering a drug discovery paradigm in the post-AlphaFold era.