白鸟 (2026-01-29 16:02):
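#paper DOI: 10.1126/science.ads9530
Title: Deep contrastive learning enables genome-wide virtual screening
Journal: Science, 2026
Summary: DrugCLIP is a framework based on deep contrastive learning for ultra-large-scale, ultrafast genome-wide virtual screening.
Core problem: Conventional molecular docking is so computationally expensive that it cannot efficiently cover the whole human genome (roughly 10,000+ protein targets) crossed with a massive compound library (e.g., 500 million molecules), a combination that amounts to trillions of protein-molecule interactions.
Approach: Contrastive learning embeds protein pockets (binding sites) and small molecules into a shared latent space, where similarity directly encodes the likelihood that a protein and a molecule bind. This enabled a pioneering trillion-scale genome-wide screen. It is a new paradigm for the post-AlphaFold era, shifting from one-to-one "target vs. compound" screening toward systematic exploration of "whole genome vs. whole chemical space".
Highlights
Very fast: DrugCLIP can process trillions of interactions within a day, which makes genome-wide scale practical.
Accurate: It performs well on multiple benchmarks and leads on metrics such as EF1% (the enrichment factor in the top 1%); it also supports multi-target screening and generalizes across targets.
Interpretable: Visualizing the embedding space (t-SNE and similar methods) shows protein-molecule matching patterns directly.
Open: The authors released a large-scale screening database that researchers can query or download directly; code for an earlier version is open source (the NeurIPS 2023 DrugCLIP repository).
Deployment and use
1. Online version: submit the input files and it generates results.
2. GitHub open-source version: the earlier version is open source and can be called from Python.
Limitations: dependence on protein structures, compute requirements, and the need for experimental validation.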
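To make the shared-embedding idea and the EF1% metric concrete, here is a minimal sketch. It is not the DrugCLIP code: the encoders are skipped entirely, the 2-D "embeddings" are hand-made toy vectors, cosine similarity is assumed as the similarity measure, and the names `score_library` and `enrichment_factor` are hypothetical.

```python
import math

def normalize(vec):
    """Scale a vector to unit length so a dot product equals cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def score_library(pocket_vec, molecule_vecs):
    """Cosine similarity between one pocket embedding and each molecule embedding."""
    pocket = normalize(pocket_vec)
    return [sum(p * m for p, m in zip(pocket, normalize(mol))) for mol in molecule_vecs]

def enrichment_factor(scores, labels, fraction=0.01):
    """Active rate in the top-scoring fraction divided by the active rate overall."""
    n = len(scores)
    n_top = max(1, round(n * fraction))
    ranked = sorted(range(n), key=lambda i: scores[i], reverse=True)
    hits_in_top = sum(labels[i] for i in ranked[:n_top])
    return (hits_in_top / n_top) / (sum(labels) / n)

# Toy data (hypothetical): 2-D embeddings placed at chosen angles from the pocket.
pocket = [1.0, 0.0]
active_angles = [5.0, 10.0]                                   # close to the pocket direction
decoy_angles = [60.0 + 120.0 * i / 197 for i in range(198)]   # 60 to 180 degrees away
angles = active_angles + decoy_angles
molecules = [[math.cos(math.radians(a)), math.sin(math.radians(a))] for a in angles]
labels = [1] * len(active_angles) + [0] * len(decoy_angles)

scores = score_library(pocket, molecules)
ef1 = enrichment_factor(scores, labels, fraction=0.01)
print(f"EF1% on the toy library: {ef1:.1f}")
```

In this toy library both actives rank above every decoy, so the top 1% (2 of 200 molecules) is all actives and EF1% reaches its maximum of 100. In a real screen the molecule embeddings can be computed once and reused for every pocket, which is where an embedding-based approach would save time over docking each pair.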
Deep contrastive learning enables genome-wide virtual screening
Abstract:
Recent breakthroughs in protein structure prediction have opened new avenues for genome-wide drug discovery, yet existing virtual screening methods remain computationally prohibitive. We present DrugCLIP, a contrastive learning framework that achieves ultrafast and accurate virtual screening, up to 10 million times faster than docking, while consistently outperforming various baselines on in silico benchmarks. In wet-lab validations, DrugCLIP achieved a 15% hit rate for norepinephrine transporter, and structures of two identified inhibitors were determined in complex with the target protein. For thyroid hormone receptor interactor 12, a target that lacks holo structures and small-molecule binders, DrugCLIP achieved a 17.5% hit rate using only AlphaFold2-predicted structures. Finally, we released GenomeScreenDB, an open-access database providing precomputed results for ~10,000 human proteins screened against 500 million compounds, pioneering a drug discovery paradigm in the post-AlphaFold era.
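A quick back-of-the-envelope check of the scale, using only figures quoted here (~10,000 proteins, 500 million compounds) plus the "within a day" figure from the post's summary. The implied throughput is a rough estimate under those assumptions; the hardware behind it is not stated in this post.

```python
# Figures quoted in the abstract and in the post's summary.
n_proteins = 10_000
n_compounds = 500_000_000
pairs = n_proteins * n_compounds          # protein-molecule pairs to score

seconds_per_day = 24 * 60 * 60
pairs_per_second = pairs / seconds_per_day  # assumes the full screen takes one day

print(f"{pairs:.1e} pairs in total")
print(f"{pairs_per_second:.2e} pairs per second to finish in a day")
```

That is 5 trillion pairs, or roughly 58 million pairs scored per second if the whole screen is to finish in a single day.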