Papers shared by user 李翛然.
33 paper shares found in total; this page shows items 1 - 20.
1.
李翛然
(2024-10-28 13:54):
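#paper Modeling protein-small molecule conformational ensembles with ChemNet doi:10.1101/2024.09.25.614868 Another strong piece of work from Baker. It brings the conformal-geometry problem we have been thinking about lately into protein structure and small-molecule interactions, and it is already implemented. The next step is really just to combine this with diffusion, at which point de novo small-molecule design could be fully automated. Baker has done 50% of my work for me~~~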
bioRxiv,
2024-9-25.
DOI: 10.1101/2024.09.25.614868
Abstract:
Modeling the conformational heterogeneity of protein-small molecule systems is an outstanding challenge. We reasoned that while residue level descriptions of biomolecules are efficient for de novo structure prediction, for probing heterogeneity of interactions with small molecules in the folded state an entirely atomic level description could have advantages in speed and generality. We developed a graph neural network called ChemNet trained to recapitulate correct atomic positions from partially corrupted input structures from the Cambridge Structural Database and the Protein Data Bank; the nodes of the graph are the atoms in the system. ChemNet accurately generates structures of diverse organic small molecules given knowledge of their atom composition and bonding, and given a description of the larger protein context, and builds up structures of small molecules and protein side chains for protein-small molecule docking. Because ChemNet is rapid and stochastic, ensembles of predictions can be readily generated to map conformational heterogeneity. In enzyme design efforts described here and elsewhere, we find that using ChemNet to assess the accuracy and pre-organization of the designed active sites results in higher success rates and higher activities; we obtain a preorganized retroaldolase with a kcat/KM of 11000 M⁻¹ min⁻¹, considerably higher than any pre-deep learning design for this reaction. We anticipate that ChemNet will be widely useful for rapidly generating conformational ensembles of small molecule and small molecule-protein systems, and for designing higher activity preorganized enzymes.
2.
李翛然
(2024-09-27 21:35):
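#paper doi:10.13345/j.cjb.220582 《工程菌种自动化高通量编辑与筛选研究进展》 (Advances in automated high-throughput editing and screening of engineered strains). This paper reviews research progress in automated high-throughput editing and screening of engineered strains in synthetic biology. By standardizing and modularizing the objects, methods, techniques, and workflows of biological experiments, synthetic biology creates an automated, high-throughput biofoundry model.
This model combines complex biological processes with automated facilities, overturns the traditional labor-intensive way of doing research, speeds up technology iteration, and promotes the development and industrial application of synthetic biology.
Research progress:
Automated gene editing:
The paper reviews the work of the Tianjin Institute of Industrial Biotechnology on automated high-throughput editing and screening.
It discusses how gene cloning, genome editing, and the design of editing sequences can be automated.
It introduces the use of gene-editing technologies such as the CRISPR/Cas9 system in automated workflows.
High-throughput screening technologies:
The paper analyzes high-throughput screening technologies such as flow cytometry, droplet microfluidics, and genome-scale perturbation sequencing.
It discusses the application and effectiveness of these technologies in screening engineered strains.
I am doing my PhD at the moment, and the advanced pharmaceutical engineering course requires reading Chinese-language papers…………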
3.
李翛然
(2024-08-31 14:38):
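#paper Development of Free Energy calculation methods for the study of monosaccharides conformation in computer simulations
Doi:10.3389/fmolb.2021.712085
On the puckered conformations of six-membered-ring monosaccharides: the work develops new computational tools to study and describe the conformational properties of carbohydrates in molecular dynamics simulations.
The most important issue is the choice of force field: the current force-field parameters (the GROMOS 45a4 parameter set) cannot reproduce the preferred conformations of the sugar components. Studying glucose conformations is difficult
both experimentally (the second most populated conformation shows up extremely rarely) and in computational simulation (the conformations are dominated by a few structures, which creates a non-ergodicity bottleneck).
Accelerated sampling methods such as metadynamics are therefore needed, and the choice of collective variables (CVs) and of the corresponding coordinate system matters a great deal,
because the non-planar, puckered conformations of the ring have to be taken into account:
1. The Cremer-Pople coordinate system (θ, φ) was used.
2. The Strauss-Pickett coordinate system (α1, α2, α3) was used.
3. The Cartesian-compressed Cremer-Pople coordinates (qx, qy) were used.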
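As a rough illustration of the first coordinate system in the list above, the following is a minimal sketch of the standard Cremer-Pople puckering parameters (Q, θ, φ) for a six-membered ring. It assumes the six ring atoms are supplied in ring order as Cartesian coordinates; the function name `cremer_pople_6` is hypothetical and is not taken from the paper. The Strauss-Pickett and (qx, qy) variants are not shown.

```python
import math

def cremer_pople_6(ring):
    """Return (Q, theta, phi) for six ring atoms listed in ring order as (x, y, z)."""
    n = len(ring)
    if n != 6:
        raise ValueError("this sketch only handles six-membered rings")
    # Shift the geometric centre of the ring to the origin.
    centre = [sum(atom[k] for atom in ring) / n for k in range(3)]
    pos = [[atom[k] - centre[k] for k in range(3)] for atom in ring]
    # Mean-plane normal: R' x R'' with sine- and cosine-weighted position sums.
    r1 = [sum(p[k] * math.sin(2 * math.pi * j / n) for j, p in enumerate(pos)) for k in range(3)]
    r2 = [sum(p[k] * math.cos(2 * math.pi * j / n) for j, p in enumerate(pos)) for k in range(3)]
    normal = [r1[1] * r2[2] - r1[2] * r2[1],
              r1[2] * r2[0] - r1[0] * r2[2],
              r1[0] * r2[1] - r1[1] * r2[0]]
    length = math.sqrt(sum(c * c for c in normal))
    # Out-of-plane displacement z_j of every ring atom.
    z = [sum(p[k] * normal[k] for k in range(3)) / length for p in pos]
    # Puckering amplitudes q2 (with phase phi) and q3 for N = 6.
    a = math.sqrt(2 / n) * sum(zj * math.cos(4 * math.pi * j / n) for j, zj in enumerate(z))
    b = -math.sqrt(2 / n) * sum(zj * math.sin(4 * math.pi * j / n) for j, zj in enumerate(z))
    q2 = math.hypot(a, b)
    q3 = sum((-1) ** j * zj for j, zj in enumerate(z)) / math.sqrt(n)
    total_q = math.hypot(q2, q3)
    theta = math.atan2(q2, q3)              # polar angle in [0, pi]
    phi = math.atan2(b, a) % (2 * math.pi)  # phase angle in [0, 2*pi)
    return total_q, theta, phi
```

In this convention the two chair conformations sit at the poles θ = 0 and θ = π, and φ is ill-defined there because q2 vanishes. That is one reason why the choice of coordinate system matters when these quantities are used as collective variables.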
Frontiers in molecular biosciences,
2021.
DOI: 10.3389/fmolb.2021.712085
PMID: 34458321
PMCID:PMC8387144
Abstract:
The grand challenge in structure-based drug design is achieving accurate prediction of binding free energies. Molecular dynamics (MD) simulations enable modeling of conformational changes critical to the binding process, leading to calculation of thermodynamic quantities involved in estimation of binding affinities. With recent advancements in computing capability and predictive accuracy, MD based virtual screening has progressed from the domain of theoretical attempts to real application in drug development. Approaches including the Molecular Mechanics Poisson Boltzmann Surface Area (MM-PBSA), Linear Interaction Energy (LIE), and alchemical methods have been broadly applied to model molecular recognition for drug discovery and lead optimization. Here we review the varied methodology of these approaches, developments enhancing simulation efficiency and reliability, remaining challenges hindering predictive performance, and applications to problems in the fields of medicine and biochemistry.
4.
李翛然
(2024-07-30 20:10):
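#paper DOI:10.1101/2023.08.08.552403 Identifying small-molecules binding sites in RNA conformational ensembles with SHAMAN How to put it: one look and you can tell this paper was written by computer people. Let me explain why.
It introduces a computational technique called SHAMAN that identifies potential small-molecule binding sites in RNA structural ensembles. Unlike other computational tools that rely on static structures, SHAMAN aims to address the challenges posed by the dynamic nature of RNA molecules. It identifies potential binding sites by analyzing a conformational ensemble of RNA structures rather than a single static structure. This approach is particularly useful for understanding how small molecules interact with the flexibility and dynamics of RNA.
The key point here is how the RNA conformations are determined, and this is the procedure they use:
1. Molecular dynamics (MD) simulations are used to generate the RNA conformational ensemble. The paper mentions 100 ns MD simulations with the Amber force field and the TIP3P water model.
2. A representative set of RNA conformations is extracted from the MD trajectory. The authors cluster the trajectory with a clustering algorithm and take the cluster centers as the representative conformations.
3. These representative conformations are analyzed to identify sites where small molecules might bind. The SHAMAN tool is what analyzes these ensembles and predicts the likely small-molecule binding sites.
This is pretty ridiculous. Picking the most likely RNA structures by clustering, how is that not ridiculous! The TIP3P water model is already the bare minimum that biologists will tolerate, and they simulate RNA under those conditions and then pick conformations with mathematical clustering. A bit ridiculous! What it lacks is some mockery from wet-lab people~~~ haha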
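For readers unfamiliar with the selection step criticized above (cluster the MD frames, keep the cluster centres), here is a minimal sketch of one common way to do it. The post does not say which clustering algorithm the authors used, so the greedy neighbour-count scheme, the RMSD cutoff, and the assumption that frames are already superimposed are all illustrative choices; `rmsd` and `cluster_frames` are hypothetical helper names.

```python
import math

def rmsd(frame_a, frame_b):
    """RMSD between two pre-aligned frames, each a list of (x, y, z) atoms."""
    squared = sum((p - q) ** 2
                  for atom_a, atom_b in zip(frame_a, frame_b)
                  for p, q in zip(atom_a, atom_b))
    return math.sqrt(squared / len(frame_a))

def cluster_frames(frames, cutoff):
    """Greedy neighbour-count clustering; returns [(centre_index, member_indices), ...]."""
    n = len(frames)
    dist = [[rmsd(frames[i], frames[j]) for j in range(n)] for i in range(n)]
    remaining = set(range(n))
    clusters = []
    while remaining:
        def neighbours(i):
            # Frames still unassigned that lie within the cutoff of frame i (including i).
            return [j for j in remaining if dist[i][j] <= cutoff]
        # The frame with the most neighbours becomes the centre; ties go to the lowest index.
        centre = max(sorted(remaining), key=lambda i: len(neighbours(i)))
        members = sorted(neighbours(centre))
        clusters.append((centre, members))
        remaining -= set(members)
    return clusters
```

The centre of each cluster is simply the frame with the most close neighbours, so the result depends on the cutoff and on how well the trajectory sampled the landscape. That dependence is essentially the poster's objection.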
bioRxiv,
2024-2-28.
DOI: 10.1101/2023.08.08.552403
Abstract:
The rational targeting of RNA with small molecules is hampered by our still limited understanding of RNA structural and dynamic properties. Most in silico tools for binding site identification rely on static structures and therefore cannot face the challenges posed by the dynamic nature of RNA molecules. Here, we present SHAMAN, a computational technique to identify potential small-molecule binding sites in RNA structural ensembles. SHAMAN enables exploring the conformational landscape of RNA with atomistic molecular dynamics and at the same time identifying RNA pockets in an efficient way with the aid of probes and enhanced-sampling techniques. In our benchmark composed of large, structured riboswitches as well as small, flexible viral RNAs, SHAMAN successfully identified all the experimentally resolved pockets and ranked them among the most favorite probe hotspots. Overall, SHAMAN sets a solid foundation for future drug design efforts targeting RNA with small molecules, effectively addressing the long-standing challenges in the field.
5.
李翛然
(2024-06-28 14:45):
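#paper: doi.org/10.1080/13543776.2024.2369630 Inhibition of GTPase KRASG12D: a review of patent literature We recently published this paper, a patent review. We also built a domestic alternative: getting a drug to market is too slow, so we used AI to design a fluorescent-probe assay kit. From now on, anyone working on KRAS, or on drugs against multiple KRAS mutations, can simply buy this kit to test drug activity, which is very convenient. The benchmark product costs 16,000 per box and ours is only 6,000. Internationally there are only the two of us. Orders are welcome. The core principle is not hard: take an effective benchmark inhibitor, attach a fluorescent probe to its tail, have the AI design the linker, and add some constraints for easy synthesis. We will also publish a paper on this AI this year, so please be patient; it is a conditional generative AI and also the first of its kind internationally. One more teaser: this year we will use a photonic quantum computer to design proteins~~
IF: 5.400, Q1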
Expert opinion on therapeutic patents,
2024-Aug.
DOI: 10.1080/13543776.2024.2369630
PMID: 38884569
Abstract:
INTRODUCTION: KRAS is a critical oncogenic protein intricately involved in tumor progression, and the difficulty in targeting KRAS has led it to be classified as an 'undruggable target.' Among the various KRAS mutations, KRASG12D is highly prevalent and represents a promising therapeutic target, yet there are currently no approved inhibitors for it. AREA COVERED: This review summarizes numerous patents and literature featuring inhibitors or degraders of KRASG12D through searching relevant information in PubMed, SciFinder and Web of Science databases from 2021 to February 2024, providing an overview of the research progress on inhibiting KRASG12D in terms of design strategies, chemical structures, biological activities, and clinical advancements. EXPERT OPINION: Since the approval of AMG510 (Sotorasib), there has been an increasing focus on the inhibition of KRASG12D, leading to numerous reports of related inhibitors and degraders. Among them, MRTX1133, as the first KRASG12D inhibitor to enter clinical trials, has demonstrated excellent tumor suppression in various KRASG12D-bearing human tumor xenograft models. It is important to note, however, that understanding the mechanisms of acquired resistance caused by KRAS inhibition and developing additional combination therapies is crucial. Moreover, seeking covalent inhibition of KRASG12D also holds significant potential.
6.
李翛然
(2024-05-30 11:40):
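#paper Alpha2beta1 integrin is the major collagen-binding integrin expressed on human Th17 https://doi.org/10.1002/eji.201040307
This paper studies the expression and function of the collagen-binding integrins α1β1 and α2β1 in human Th17 cells. The main findings are:
During differentiation, Th17 cells preferentially upregulate α2β1 integrin (also known as VLA-2) rather than α1β1 integrin (VLA-1).
Most Th17 cells express the α2 integrin subunit, whereas only a few express the α1 integrin subunit.
Th17 cells adhere to type I and type II collagen via α2β1 integrin but do not adhere to type IV collagen.
Binding of α2β1 integrin to type I and type II collagen costimulates human Th17 cells to produce IL-17A, IL-17F, and IFN-γ.
I have said this many times: collagen in dressings and skincare products is not there for transdermal absorption at all!!!!!!! It works just by sitting between the cells!!!! These idiots, they drive me mad!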
Abstract:
Growing evidence indicates that collagen-binding integrins are important costimulatory molecules of effector T cells. In this study, we demonstrate that the major collagen-binding integrin expressed by human Th17 cells is alpha2beta1 (α2β1) or VLA-2, also known as the receptor for collagen I on T cells. Our results show that human naïve CD4(+) T cells cultured under Th17 polarization conditions preferentially upregulate α2β1 integrin rather than α1β1 integrin, which is the receptor for collagen IV on T cells. Double staining analysis for integrin receptors and intracellular IL-17 showed that α2 integrin but not α1 integrin is associated with Th17 cells. Cell adhesion experiments demonstrated that Th17 cells attach to collagen I and collagen II using α2β1 integrin but did not attach to collagen IV. Functional studies revealed that collagens I and II but not collagen IV costimulate the production of IL-17A, IL-17F and IFN-γ by human Th17 cells activated with anti-CD3. These results identify α2β1 integrin as the major collagen receptor expressed on human Th17 cells and suggest that it can be an important costimulatory molecule of Th17 cell responses.
7.
李翛然
(2024-04-28 18:09):
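#paper doi:10.1186/s42825-019-0012-x Nature Communication. Quantitative and structural analysis of isotopically labelled natural crosslinks in type I skin collagen using LC-HRMS and SANS This paper presents a quantitative and structural analysis of isotopically labelled natural crosslinks in type I skin collagen using LC-HRMS and SANS. The study focuses on the two major crosslinks in skin, HLNL and HHMD, which were isotopically labelled and analysed to understand their structural changes and their interaction with chromium sulphate. It stresses the importance of developing a benign crosslinking method that preserves the intrinsic physical properties of collagen, particularly for the leather industry. The main findings include confirmation that HLNL and HHMD each contain one imino group, which makes them prone to degradation at low pH, and that extreme pH changes and chrome tanning cause structural changes in collagen. The analytical method used in this study can also be applied to study artificial crosslinking in other collagenous tissues for biomedical applications. This counts as the first paper ever to work out exactly which crosslinks collagen has~~ so chemical crosslinking methods are basically a dead end; better to go with biological methods.~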
Abstract:
Collagen structure in biological tissues imparts its intrinsic physical properties by the formation of several covalent crosslinks. For the first time, two major crosslinks in the skin, dihydroxylysinonorleucine (HLNL) and histidinohydroxymerodesmosine (HHMD), were isotopically labelled and then analysed by liquid-chromatography high-resolution accurate-mass mass spectrometry (LC-HRMS) and small-angle neutron scattering (SANS). The isotopic labelling followed by LC-HRMS confirmed the presence of one imino group in both HLNL and HHMD, making them more susceptible to degrade at low pH. The structural changes in collagen due to extreme changes in the pH and chrome tanning were highlighted by the SANS contrast variation between isotopic labelled and unlabelled crosslinks. This provided a better understanding of the interaction of natural crosslinks with the chromium sulphate in collagen suggesting that the development of a benign crosslinking method can help retain the intrinsic physical properties of the leather. This analytical method can also be applied to study artificial crosslinking in other collagenous tissues for biomedical applications.
8.
李翛然
(2024-03-31 01:07):
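#paper doi:doi.org/10.1021/acs.analchem.2c05065
Simultaneous Dual-Wavelength Source Raman Spectroscopy with a Handheld Confocal Probe for Analysis of the Chemical Composition of In Vivo Human Skin The paper presents a portable confocal Raman spectroscopy system with a simultaneous dual-wavelength source and a miniature handheld probe for analysing the chemical composition of in vivo human skin. The system acquires spectra in the fingerprint region (450-1750 cm⁻¹) and the high-wavenumber region (2800-3800 cm⁻¹) at the same time, which addresses the limitations of current commercial CRS systems. Key points include an innovative design combining 671 and 785 nm lasers, a precise Raman spectra separation algorithm (PRSSA) for decoupling the FP and HW spectra, and a reduction in data acquisition time of more than 50%. The system performs well in fast, ultra-wideband spectral acquisition and shows the potential for integrating CRS into clinical workflows.
I may get a Raman spectrometer soon to use for beauty treatments.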
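A short worked calculation helps show why two excitation wavelengths allow both regions to be recorded together. The sketch below converts a Raman shift into the absolute wavelength of the Stokes-scattered light. It assumes that the 785 nm laser excites the fingerprint region and the 671 nm laser excites the high-wavenumber region; that assignment is an assumption made here for illustration and is not stated in the post. The helper name `stokes_wavelength_nm` is hypothetical.

```python
def stokes_wavelength_nm(excitation_nm, shift_cm1):
    """Absolute wavelength (nm) of Stokes-scattered light for a given Raman shift (cm^-1)."""
    excitation_cm1 = 1e7 / excitation_nm       # nm -> absolute wavenumber in cm^-1
    return 1e7 / (excitation_cm1 - shift_cm1)  # shifted wavenumber -> nm

# Assumed pairing: 785 nm for the fingerprint (FP) region, 671 nm for the high-wavenumber (HW) region.
fp_window = tuple(stokes_wavelength_nm(785.0, shift) for shift in (450.0, 1750.0))
hw_window = tuple(stokes_wavelength_nm(671.0, shift) for shift in (2800.0, 3800.0))

print("FP window (nm):", fp_window)
print("HW window (nm):", hw_window)
```

Under that assumption both windows fall in roughly the same 810-910 nm band, so a single spectrometer range can record the two regions at once. The overlap is also why the fused spectra then need a separation step such as the PRSSA mentioned above.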
Abstract:
Confocal Raman spectroscopy (CRS) is a powerful tool that has been widely used for biological tissue analysis because of its noninvasive nature, high specificity, and rich biochemical information. However, current commercial CRS systems suffer from limited detection regions (450-1750 cm⁻¹), bulky sizes, nonflexibilities, slow acquisitions by consecutive excitations, and high costs if using a Fourier transform (FT) Raman spectroscopy with an InGaAs detector, which impede their adoption in clinics. In this study, we developed a portable CRS system with a simultaneous dual-wavelength source and a miniaturized handheld probe (120 mm × 60 mm × 50 mm) that can acquire spectra in both fingerprint (FP, 450-1750 cm⁻¹) and high wavenumber (HW, 2800-3800 cm⁻¹) regions simultaneously. An innovative design combining 671 and 785 nm lasers for simultaneous excitation through a compact and high-efficiency (>90%) wavelength combiner was implemented. Moreover, to decouple the fused FP and HW spectra, a first-of-its-kind precise Raman spectra separation algorithm (PRSSA) was developed based on the maximum a posteriori probability (MAP) estimate. The accuracy of spectra separation was greater than 99%, demonstrated in both phantom experiments and human skin measurements. The total data acquisition time was reduced by greater than 50% compared to other CRS systems. The results proved our proposed CRS system and PRSSA's superior capability in fast and ultrawideband spectra acquisition will significantly improve the integration of CRS in the clinical workflow.
9.
李翛然
(2024-02-28 18:11):
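#paper A computational framework for neural network-based variational Monte Carlo with Forward Laplacian doi: https://doi.org/10.1038/s42256-024-00794-x A paper from a Peking University and ByteDance collaboration; I follow it because I keep an eye on new developments in computational chemistry.
1. The ByteDance and Peking University team jointly tackled the high computational cost of neural-network variational Monte Carlo (NN-VMC) on large-scale quantum systems.
2. The team proposes the "Forward Laplacian" computational framework, which computes the Laplacian of the neural network directly and efficiently in a forward pass and substantially improves the efficiency of NN-VMC.
3. They also designed an efficient network architecture called "LapNet", which exploits the Forward Laplacian and greatly reduces the compute needed for model training.
4. NN-VMC combined with Forward Laplacian and LapNet performs strongly across a range of chemical systems, computing both absolute and relative energies accurately and agreeing well with experimental data and gold-standard computational methods.
5. Despite this clear progress, the team notes that more chemical and physical knowledge will have to be built into NN-VMC to resolve the discrepancies seen in some applications, and that Forward Laplacian should also prove useful more broadly in quantum mechanics and in neural-network-based solvers for partial differential equations.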
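The idea in point 2 can be illustrated on a toy network. The sketch below is not the authors' implementation; it assumes a one-hidden-layer network f(x) = w2 · tanh(W1 x + b1) and carries the value, the gradient, and the Laplacian forward together through each operation, so no second backward pass is needed. The function name and argument layout are hypothetical.

```python
import math

def forward_laplacian_mlp(x, W1, b1, w2):
    """Return (f, grad f, laplacian f) for f(x) = w2 . tanh(W1 x + b1) in one forward pass."""
    f, grad, lap = 0.0, [0.0] * len(x), 0.0
    for row, bias, out_w in zip(W1, b1, w2):
        # Linear layer: value and gradient propagate linearly; its Laplacian in x is zero.
        pre = sum(w * xi for w, xi in zip(row, x)) + bias
        pre_grad = list(row)
        pre_lap = 0.0
        # tanh: chain rule for the first and second derivatives.
        t = math.tanh(pre)
        d1 = 1.0 - t * t      # tanh'(pre)
        d2 = -2.0 * t * d1    # tanh''(pre)
        act_grad = [d1 * g for g in pre_grad]
        act_lap = d1 * pre_lap + d2 * sum(g * g for g in pre_grad)
        # Linear output layer: accumulate the weighted contributions.
        f += out_w * t
        grad = [gi + out_w * ag for gi, ag in zip(grad, act_grad)]
        lap += out_w * act_lap
    return f, grad, lap
```

The rule for an elementwise nonlinearity σ is Δσ(u) = σ'(u) Δu + σ''(u) |∇u|², which is what the `act_lap` line implements. Deeper networks would apply the same two rules layer by layer, using the Jacobian of each hidden unit with respect to the inputs in place of `pre_grad`.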
10.
李翛然
(2024-01-30 16:22):
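#paper: doi:doi.org/10.1186/s42825-019-0012-x Quantitative and structural analysis of isotopically labelled natural crosslinks in type I skin collagen using LC-HRMS and SANS This paper describes the methods and results of a quantitative and structural analysis of isotopically labelled natural crosslinks in type I skin collagen using liquid chromatography high-resolution mass spectrometry (LC-HRMS) and small-angle neutron scattering (SANS). The study matters for understanding the structure and function of skin collagen.
1. Sample preparation: the study used isotopically labelled type I skin collagen samples, prepared by specific experimental procedures.
2. LC-HRMS analysis: LC-HRMS was used to quantify the isotopically labelled natural crosslinks in the samples. LC-HRMS provides high-resolution, high-sensitivity results.
3. SANS analysis: SANS was used for structural analysis of the isotopically labelled natural crosslinks in the samples. SANS provides information on the size, shape, and distribution of the crosslinks in the sample.
Strengths of the paper:
1. Combined analytical approach: the study uses two different techniques, LC-HRMS and SANS, and so examines the isotopically labelled natural crosslinks from both the quantitative and the structural side.
2. High resolution and high sensitivity: LC-HRMS is both high-resolution and highly sensitive and gives accurate quantitative results.
3. Structural information: SANS provides structural information on the crosslinks, which helps in understanding their distribution and role in skin collagen.
However, the paper also has some limitations:
1. Sample limitation: the study uses isotopically labelled type I skin collagen samples, which may not fully represent collagen in its natural state.
2. Technical limitation: although LC-HRMS and SANS have advantages for analysing isotopically labelled natural crosslinks, they still have limits, such as long analysis times and high equipment costs.
3. Interpretation of results: because isotopically labelled natural crosslinks are complex, interpreting the results can be challenging and needs further research and validation.
Overall, by combining LC-HRMS and SANS, the paper provides a method for the quantitative and structural analysis of isotopically labelled natural crosslinks and reveals their characteristics and role in type I skin collagen, offering an important reference for further research on the structure and function of skin collagen. This paper is right in my research area and has been a great help.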
Abstract:
Collagen structure in biological tissues imparts its intrinsic physical properties by the formation of several covalent crosslinks. For the first time, two major crosslinks in the skin, dihydroxylysinonorleucine (HLNL) and histidinohydroxymerodesmosine (HHMD), were isotopically labelled and then analysed by liquid-chromatography high-resolution accurate-mass mass spectrometry (LC-HRMS) and small-angle neutron scattering (SANS). The isotopic labelling followed by LC-HRMS confirmed the presence of one imino group in both HLNL and HHMD, making them more susceptible to degrade at low pH. The structural changes in collagen due to extreme changes in the pH and chrome tanning were highlighted by the SANS contrast variation between isotopic labelled and unlabelled crosslinks. This provided a better understanding of the interaction of natural crosslinks with the chromium sulphate in collagen suggesting that the development of a benign crosslinking method can help retain the intrinsic physical properties of the leather. This analytical method can also be applied to study artificial crosslinking in other collagenous tissues for biomedical applications.
11.
李翛然
(2023-12-30 10:31):
#paper doi:10.4161/bioe.28791 Bioengineered collagens: Emerging directions for biomedical materials
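1. The history and applications of collagen as a biomedical material.
2. The limitations of animal-derived collagen and the risk of disease transmission.
3. Advances in recombinant collagen technology, especially the properties and potential applications of bacterial collagens expressed in E. coli.
4. The potential of bioengineering approaches to improve collagen stability and function.
5. Different systems (such as yeast, insect cells, plants, and microorganisms) used for collagen production.
6. The properties, stability, non-immunogenicity, and production methods of bacterial collagens.
7. Using bioengineering techniques to design collagen structures with specific functions.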
Abstract:
Mammalian collagen has been widely used as a biomedical material. Nevertheless, there are still concerns about the variability between preparations, particularly with the possibility that the products may transmit animal-based diseases. Many groups have examined the possible application of bioengineered mammalian collagens. However, translating laboratory studies into large-scale manufacturing has often proved difficult, although certain yeast and plant systems seem effective. Production of full-length mammalian collagens, with the required secondary modification to give proline hydroxylation, has proved difficult in E. coli. However, recently, a new group of collagens, which have the characteristic triple helical structure of collagen, has been identified in bacteria. These proteins are stable without the need for hydroxyproline and are able to be produced and purified from E. coli in high yield. Initial studies indicate that they would be suitable for biomedical applications.
12.
李翛然
(2023-11-28 20:26):
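#paper doi:10.1016/j.heliyon.2023.e17575 AI in drug discovery and its clinical relevance A review. Among those of recent years, I think this is a fairly good one that presents the current state of the AI drug-discovery industry from a clinical perspective, and it covers most of the functional goals AI is used for. You can tell that the authors' grasp of AI drug design is not very deep, but that does not stop them from approaching the matter from its fundamentals, that is, evaluating it in terms of FDA review and approval and from the clinical side. So as an introductory review it is very good, and anyone interested in this industry should take a look.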
Abstract:
The COVID-19 pandemic has emphasized the need for novel drug discovery process. However, the journey from conceptualizing a drug to its eventual implementation in clinical settings is a long, complex, and expensive process, with many potential points of failure. Over the past decade, a vast growth in medical information has coincided with advances in computational hardware (cloud computing, GPUs, and TPUs) and the rise of deep learning. Medical data generated from large molecular screening profiles, personal health or pathology records, and public health organizations could benefit from analysis by Artificial Intelligence (AI) approaches to speed up and prevent failures in the drug discovery pipeline. We present applications of AI at various stages of drug discovery pipelines, including the inherently computational approaches of design and prediction of a drug's likely properties. Open-source databases and AI-based software tools that facilitate drug design are discussed along with their associated problems of molecule representation, data collection, complexity, labeling, and disparities among labels. How contemporary AI methods, such as graph neural networks, reinforcement learning, and generated models, along with structure-based methods, (i.e., molecular dynamics simulations and molecular docking) can contribute to drug discovery applications and analysis of drug responses is also explored. Finally, recent developments and investments in AI-based start-up companies for biotechnology, drug design and their current progress, hopes and promotions are discussed in this article.
13.
李翛然
(2023-10-31 13:21):
#paper doi:10.1093/bioinformatics/btad596 DeepCCI: a deep learning framework for identifying cell-cell interactions from single-cell RNA sequencing data
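A new framework that uses scRNA data to explain cell-cell interactions. The biggest problem, in my view, is that, judging from its training set and datasets, it still trains on preliminarily processed scRNA data, that is, right after UMAP dimensionality reduction and clustering. That is still a very rudimentary idea, and explanations of the real mechanisms of cell-cell interaction will be poor at this granularity. Still, it is a cross-disciplinary application and deserves encouragement.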
Abstract:
MOTIVATION: Cell-cell interactions (CCIs) play critical roles in many biological processes such as cellular differentiation, tissue homeostasis, and immune response. With the rapid development of high throughput single-cell RNA sequencing (scRNA-seq) technologies, it is of high importance to identify CCIs from the ever-increasing scRNA-seq data. However, limited by the algorithmic constraints, current computational methods based on statistical strategies ignore some key latent information contained in scRNA-seq data with high sparsity and heterogeneity. RESULTS: Here, we developed a deep learning framework named DeepCCI to identify meaningful CCIs from scRNA-seq data. Applications of DeepCCI to a wide range of publicly available datasets from diverse technologies and platforms demonstrate its ability to predict significant CCIs accurately and effectively. Powered by the flexible and easy-to-use software, DeepCCI can provide the one-stop solution to discover meaningful intercellular interactions and build CCI networks from scRNA-seq data. AVAILABILITY AND IMPLEMENTATION: The source code of DeepCCI is available online at https://github.com/JiangBioLab/DeepCCI.
14.
李翛然
(2023-09-26 17:25):
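#paper Cell2Sentence: Teaching Large Language Models the Language of Biology doi:10.1101/2023.09.11.557287
The paper proposes a new method called Cell2Sentence that lets large language models be trained on single-cell transcriptomic data. The method represents gene expression profiles as text sequences, which the authors call "cell sentences". These cell sentences consist of gene names ordered by expression level, giving a robust and reversible encoding of the biological data. The authors show that cell sentences correctly encode gene expression data in a format that language models handle easily. Language models fine-tuned on cell sentences not only converge robustly but also perform markedly better on cell-sentence tasks than models trained from scratch or other state-of-the-art deep learning models built specifically for single-cell RNA sequencing data. Cell sentences integrate seamlessly with text annotations for generation and summarization tasks, both of which benefit from natural-language pretraining. In fact, there is no theoretical limit on applying any text-based architecture to the cell sentences produced by Cell2Sentence. The authors' findings highlight the benefits of transfer learning in this interdisciplinary setting.
In short, the method provides a simple, adaptable framework that combines natural language and transcriptomics using existing language models and libraries. The authors show that language models can be further fine-tuned to generate and understand transcriptomic data while retaining their ability to generate text. This opens new avenues for analysing, interpreting, and generating single-cell RNA sequencing data.
Key contributions:
Introducing Cell2Sentence, an effective method for representing single-cell data as text sequences.
Showing that large language models can be fine-tuned on cell sentences to generate accurate cells for a given cell type and to understand transcriptomic data well enough to predict cell labels.
Providing a simple, modular framework that adapts LLMs to transcriptomics using popular LLM libraries.
The key idea of the Cell2Sentence model is to convert a single cell's gene expression profile into a text sequence of gene names ordered by expression level. Specifically: standard preprocessing is applied to the single-cell RNA sequencing data, including filtering out low-quality cells and normalizing the count matrix. The genes of each cell are then sorted by expression, from highest to lowest. The sorted sequence of gene names is taken as the text for that cell and is called a "cell sentence". Metadata such as cell type and other biological annotations can be added to the cell sentence. Existing pretrained language models can be further fine-tuned on these cell sentences to learn their distribution. The fine-tuned model can be used for downstream tasks, such as generating a cell sentence from a cell-type prompt or predicting the cell type from a cell sentence. Generated cell sentences can be converted back to gene expression space for subsequent analysis. The whole framework offers a flexible way to apply existing language models directly to transcriptomic data.
The key innovation of Cell2Sentence is a reversible transformation from cell expression to text sequences, which represents single-cell data in a format that language models can process. The study shows that this transformation passes information efficiently between the two modalities and so makes it possible to apply natural-language models.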
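The rank-ordering step described above is simple enough to sketch directly. The code below is an illustration, not the authors' implementation: it assumes the expression profile is a plain dictionary of already normalized values, drops genes with zero expression, and breaks ties alphabetically, all of which are choices made here. The function names are hypothetical.

```python
def cell_to_sentence(expression, top_k=None):
    """Turn {gene: expression} into a string of gene names ordered by decreasing expression."""
    expressed = [(gene, value) for gene, value in expression.items() if value > 0]
    # Highest expression first; ties are broken alphabetically so the output is deterministic.
    expressed.sort(key=lambda pair: (-pair[1], pair[0]))
    genes = [gene for gene, _ in expressed]
    if top_k is not None:
        genes = genes[:top_k]
    return " ".join(genes)

def sentence_to_ranks(sentence):
    """Recover each gene's rank (1 = most expressed) from a cell sentence."""
    return {gene: rank for rank, gene in enumerate(sentence.split(), start=1)}

cell = {"CD3E": 5.0, "GAPDH": 12.0, "MALAT1": 40.0, "IL7R": 0.0, "ACTB": 12.0}
print(cell_to_sentence(cell))           # MALAT1 ACTB GAPDH CD3E
print(cell_to_sentence(cell, top_k=2))  # MALAT1 ACTB
```

The inverse shown here only recovers ranks. Mapping ranks back to expression values would additionally need a fitted rank-to-expression relationship, which this sketch leaves out.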
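This is the first large-model approach to gene and single-cell analysis that I have seen. It is obviously student work: for example, upstream and downstream regulation in transcription and the heterogeneity of genes are not considered at all.
Still, I do think it is a step forward. As AI gets more deeply involved, if a huge library of DNA-RNA-protein correspondences really does get built, human regenerative medicine will take a qualitative leap, and I do not think that will take very long.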
Sciety,
2023.
DOI: 10.1101/2023.09.11.557287
Abstract:
Large language models like GPT have shown impressive performance on natural language tasks. Here, we present a novel method to directly adapt these pretrained models to a biological context, specifically single-cell transcriptomics, by representing gene expression data as text. Our Cell2Sentence approach converts each cell’s gene expression profile into a sequence of gene names ordered by expression level. We show that these gene sequences, which we term “cell sentences”, can be used to fine-tune causal language models like GPT-2. Critically, we find that natural language pretraining boosts model performance on cell sentence tasks. When fine-tuned on cell sentences, GPT-2 generates biologically valid cells when prompted with a cell type. Conversely, it can also accurately predict cell type labels when prompted with cell sentences. This demonstrates that language models fine-tuned using Cell2Sentence can gain a biological understanding of single-cell data, while retaining their ability to generate text. Our approach provides a simple, adaptable framework to combine natural language and transcriptomics using existing models and libraries. Our code is available at: https://github.com/vandijklab/cell2sentence-ft.
15.
李翛然
(2023-08-28 23:16):
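#paper doi:10.1101/2023.07.27.550799
Context-Dependent Design of Induced-fit Enzymes using Deep Learning Generates Well Expressed, Thermally Stable and Active Enzymes This paper proposes a new enzyme design method:
It proposes a new enzyme design strategy, CoSaNN, which uses the deep-learning structure prediction model AlphaFold to generate conformations of new enzymes. The method considers how amino-acid sequence segments fold in different conformational contexts, so it can predict the conformation of chimeric sequences more accurately.
In the sequence optimization stage, the method does not rely on RosettaDesign alone but also uses the graph-neural-network-based ProteinMPNN model. ProteinMPNN can learn higher-order nonlinear relationships between sequence and conformation and generates more foldable sequences.
A graph neural network classifier that predicts soluble expression, SolvIT, was additionally trained as a further optimization layer in the design pipeline, which raises the chance of producing highly expressed enzymes.
In the ROK sugar kinase family, the method was used to design new enzymes with high activity, strong thermal stability, and high expression. Some of the designed enzymes showed better catalytic properties than the template enzyme.
The method demonstrates that deep learning models can capture the conformational changes and sequence relationships of complex proteins, which allows extensive remodelling of enzyme structure while preserving catalytic activity and regulatory mechanisms. This opens new avenues for targeted biotechnological applications.
However: the paper does not say how to design an enzyme when the active site is not clearly known. In other words, the active site still has to be established biologically before any new directed evolution of the enzyme can proceed.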
2023.
DOI: 10.1101/2023.07.27.550799
Abstract:
The potential of engineered enzymes in practical applications is often constrained by limitations in their expression levels, thermal stability, and the diversity and magnitude of catalytic activities. De-novo enzyme design, though exciting, is challenged by the complex nature of enzymatic catalysis. An alternative promising approach involves expanding the capabilities of existing natural enzymes to enable functionality across new substrates and operational parameters. To this end we introduce CoSaNN (Conformation Sampling using Neural Network), a novel strategy for enzyme design that utilizes advances in deep learning for structure prediction and sequence optimization. By controlling enzyme conformations, we can expand the chemical space beyond the reach of simple mutagenesis. CoSaNN uses a context-dependent approach that accurately generates novel enzyme designs by considering non-linear relationships in both sequence and structure space. Additionally, we have further developed SolvIT, a graph neural network trained to predict protein solubility in E. coli, as an additional optimization layer for producing highly expressed enzymes. Through this approach, we have engineered novel enzymes exhibiting superior expression levels, with 54% of our designs expressed in E. coli, and increased thermal stability with more than 30% of our designs having a higher Tm than the template enzyme. Furthermore, our research underscores the transformative potential of AI in protein design, adeptly capturing high order interactions and preserving allosteric mechanisms in extensively modified enzymes. These advancements pave the way for the creation of diverse, functional, and robust enzymes, thereby opening new avenues for targeted biotechnological applications.
16.
李翛然
(2023-07-28 15:19):
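#paper Role of neuroinflammation in neurodegeneration development doi:10.3892/mmr.2016.4948 This is a review from the team of Professors 夏海滨/张伟锋 at Shaanxi Normal University. It mainly covers the links between inflammation and neurodegeneration, summarizes what has so far been validated by bioinformatics and experiment, and reviews ongoing clinical progress. Broadly speaking, there is growing international acceptance that the initial trigger of many serious diseases, such as neurodegeneration and cancer, is the sustained progression of inflammation, disorder of the immune system, and changes in cell function. From a drug company's point of view, I have a useful cross-cutting comparison. These signs of organ and cellular inflammation map well onto "dampness" (湿气) in traditional Chinese medicine, and regulating this so-called "dampness" really amounts to bringing the immune system back to normal. Immunity is a large systemic problem that Western medical theory has, to this day, not explained in a complete and systematic way. Once AI and more bioinformatics get involved, biology will see a whole new revolution. At the very least, the best-selling drugs of recent years are basically all monoclonal antibodies and cell therapies, which are in essence immune modulation.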
Abstract:
Neurodegeneration is a phenomenon that occurs in the central nervous system through the hallmarks associating the loss of neuronal structure and function. Neurodegeneration is observed after viral insult and mostly in various so-called 'neurodegenerative diseases', generally observed in the elderly, such as Alzheimer's disease, multiple sclerosis, Parkinson's disease and amyotrophic lateral sclerosis that negatively affect mental and physical functioning. Causative agents of neurodegeneration have yet to be identified. However, recent data have identified the inflammatory process as being closely linked with multiple neurodegenerative pathways, which are associated with depression, a consequence of neurodegenerative disease. Accordingly, pro‑inflammatory cytokines are important in the pathophysiology of depression and dementia. These data suggest that the role of neuroinflammation in neurodegeneration must be fully elucidated, since pro‑inflammatory agents, which are the causative effects of neuroinflammation, occur widely, particularly in the elderly in whom inflammatory mechanisms are linked to the pathogenesis of functional and mental impairments. In this review, we investigated the role played by the inflammatory process in neurodegenerative diseases.
17.
李翛然
(2023-06-27 14:00):
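#paper Reduced hepatocyte mitophagy is an early feature of NAFLD pathogenesis and hastens the onset of steatosis, inflammation and fibrosis doi: 10.21203/rs.3.rs-2469234/v1. This paper is very interesting. Nonalcoholic fatty liver disease (NAFLD) is a chronic disease prevalent worldwide, caused by excessive accumulation of fat in hepatocytes, and is therefore also called hepatic steatosis. Its clinical spectrum includes hepatic steatosis (NAFL, fat accumulation in more than 5% of hepatocytes), nonalcoholic steatohepatitis (NASH, characterized by hepatocyte injury, fibrosis, inflammation, and so on), cirrhosis, and hepatocellular carcinoma. NAFLD is closely associated with features of the metabolic syndrome, including obesity, insulin resistance, hyperglycemia, type 2 diabetes, and dyslipidemia. These days, basically everyone over 35 with mildly elevated blood lipids has fatty liver to some degree.
The traditional view is that fatty liver is all about diet and metabolism. But as our own team's understanding of aging models has deepened, and from what we have learned from various sources, non-bacterial inflammation of all kinds is in fact also the culprit behind cancer and hidden diseases like NASH. This coincides with traditional Chinese medicine, and is especially like what it calls heavy "dampness" (湿气), which is really the problem of inflammation in the body not being fully cleared.
These past two years really are the start of good times for biology! Bioinformatics is being applied more and more, and in the end things return to basics; I look forward to the day when diseases are controlled with natural products.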
Abstract:
Nonalcoholic fatty liver disease (NAFLD) encompasses a spectrum of pathologies that includes steatosis, steatohepatitis (NASH) and fibrosis and is strongly associated with insulin resistance and type 2 diabetes. Changes in …
>>>
Nonalcoholic fatty liver disease (NAFLD) encompasses a spectrum of pathologies that includes steatosis, steatohepatitis (NASH) and fibrosis and is strongly associated with insulin resistance and type 2 diabetes. Changes in mitochondrial function are implicated in the pathogenesis of NAFLD, particularly in the transition from steatosis to NASH. Mitophagy is a mitochondrial quality control mechanism that allows for the selective removal of damaged mitochondria from the cell via the autophagy pathway. While past work demonstrated a negative association between liver fat content and rates of mitophagy, when changes in mitophagy occur during the pathogenesis of NAFLD and whether such changes contribute to the primary endpoints associated with the disease are currently poorly defined. We therefore undertook the studies described here to establish when alterations in mitophagy occur during the pathogenesis of NAFLD, as well as to determine the effects of genetic inhibition of mitophagy via conditional deletion of a key mitophagy regulator, PARKIN, on the development of steatosis, insulin resistance, inflammation and fibrosis. We find that loss of mitophagy occurs early in the pathogenesis of NAFLD and that loss of PARKIN hastens the onset but not severity of key NAFLD disease features. These observations suggest that loss of mitochondrial quality control in response to nutritional stress may contribute to mitochondrial dysfunction and the pathogenesis of NAFLD.
18.
李翛然
(2023-05-29 22:06):
#paper doi:https://doi.org/10.1016/j.eng.2023.01.014
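Artificial Intelligence in Pharmaceutical Sciences. This is a review published in Engineering by several Chinese universities. My assessment: for anyone who wants to get into AI drug discovery, it is a good summary of the history, but its outlook on the future clearly lacks depth. It looks at the development of the pharmaceutical industry from a purely outside perspective, and clearly does not dig into drug development itself and analyze it together with AI. You can read it as an introductory article.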
Abstract:
Drug discovery and development affects various aspects of human health and dramatically impacts the pharmaceutical market. However, investments in a new drug often go unrewarded due to the long and complex process of drug research and development (R&D). With the advancement of experimental technology and computer hardware, artificial intelligence (AI) has recently emerged as a leading tool in analyzing abundant and high-dimensional data. Explosive growth in the size of biomedical data provides advantages in applying AI in all stages of drug R&D. Driven by big data in biomedicine, AI has led to a revolution in drug R&D, due to its ability to discover new drugs more efficiently and at lower cost. This review begins with a brief overview of common AI models in the field of drug discovery; then, it summarizes and discusses in depth their specific applications in various stages of drug R&D, such as target discovery, drug discovery and design, preclinical research, automated drug synthesis, and influences in the pharmaceutical market. Finally, the major limitations of AI in drug R&D are fully discussed and possible solutions are proposed.
19.
李翛然
(2023-04-28 17:37):
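#paper De novo design of protein interactions with learned surface fingerprints doi: 10.1038/s41586-023-05993-x.
The paper's approach has three stages: (1) use MaSIF-site to predict buried interface sites with high binding propensity on the surface of the target protein; (2) use MaSIF-seed to search, based on surface fingerprints, for complementary structural motifs (binding seeds) whose features match the target site; (3) graft the binding seeds onto protein scaffolds and use Rosetta to optimize the designed interface, adding stability and extra contacts.
The paper's main conclusion is that the authors used this surface-centric approach to design and experimentally validate de novo binders against four protein targets: SARS-CoV-2 spike, PD-1, PD-L1 and CTLA-4. Some of the designs were experimentally optimized, while others were generated purely in silico, reaching nanomolar affinity. Structural and mutational analysis showed the predictions to be highly accurate. Overall, the authors' method captures the physical and chemical determinants of molecular recognition, offering an approach for the de novo design of protein interactions and, more broadly, of artificial proteins with function.
The summary above was generated by ChatGPT. My own impression after reading, though, is that I do not think this paper is at the level of Nature itself. The MaSIF algorithm is indeed useful for comparing protein structures, but there is a deeper issue the paper does not address: for a known protein, there are already plenty of algorithms for designing an effective protein binder, and papers from the last two years have backed them with good experimental results. For proteins with entirely new structures, or with no usable binder at all, the challenge is enormous. The paper offers no ideas for handling that case, and its algorithmic novelty does not even match Baker's recent RFdiffusion. In short, this really is a blue-ocean market right now. There are so many opportunities in this field.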
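The three-stage workflow summarized above can be illustrated with a toy example. This is a minimal sketch, not the MaSIF code: the `Patch` class, `pick_target_site`, `rank_seeds`, and all scores and fingerprint vectors are hypothetical, and it assumes that a good seed is one whose descriptor lies close (in Euclidean distance) to the target site's descriptor. Stage 3 (grafting and Rosetta refinement) is not modeled.

```python
from dataclasses import dataclass
from math import dist


@dataclass(frozen=True)
class Patch:
    """Hypothetical surface patch: a name, a score, and a descriptor vector."""
    name: str
    binding_score: float  # made-up interface propensity in [0, 1]
    fingerprint: tuple    # made-up surface descriptor


def pick_target_site(patches):
    """Stage 1 stand-in: choose the patch with the highest binding propensity."""
    return max(patches, key=lambda p: p.binding_score)


def rank_seeds(target, seeds, top_k=2):
    """Stage 2 stand-in: rank seeds by descriptor distance to the target site.

    Assumption: a smaller Euclidean distance means a better match.
    """
    ranked = sorted(seeds, key=lambda s: dist(target.fingerprint, s.fingerprint))
    return ranked[:top_k]


# Illustrative data only; none of these values come from the paper.
target_patches = [
    Patch("site_A", 0.35, (0.1, 0.9, 0.2)),
    Patch("site_B", 0.82, (0.7, 0.1, 0.4)),
]
seeds = [
    Patch("helix_1", 0.0, (0.6, 0.2, 0.5)),
    Patch("strand_7", 0.0, (0.1, 0.8, 0.3)),
    Patch("helix_3", 0.0, (0.9, 0.0, 0.4)),
]

site = pick_target_site(target_patches)
best_seeds = rank_seeds(site, seeds)
print(site.name, [s.name for s in best_seeds])
```

The point of the sketch is only the shape of the search: pick a site first, then retrieve seeds by descriptor similarity, which is why the quality of the learned fingerprints determines everything downstream.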
Abstract:
Physical interactions between proteins are essential for most biological processes governing life. However, the molecular determinants of such interactions have been challenging to understand, even as genomic, proteomic and structural data increase. This knowledge gap has been a major obstacle for the comprehensive understanding of cellular protein-protein interaction networks and for the de novo design of protein binders that are crucial for synthetic biology and translational applications. Here we use a geometric deep-learning framework operating on protein surfaces that generates fingerprints to describe geometric and chemical features that are critical to drive protein-protein interactions. We hypothesized that these fingerprints capture the key aspects of molecular recognition that represent a new paradigm in the computational design of novel protein interactions. As a proof of principle, we computationally designed several de novo protein binders to engage four protein targets: SARS-CoV-2 spike, PD-1, PD-L1 and CTLA-4. Several designs were experimentally optimized, whereas others were generated purely in silico, reaching nanomolar affinity with structural and mutational characterization showing highly accurate predictions. Overall, our surface-centric approach captures the physical and chemical determinants of molecular recognition, enabling an approach for the de novo design of protein interactions and, more broadly, of artificial proteins with function.
20.
李翛然
(2023-03-28 21:41):
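#paper A novel protein RASON encoded by a lncRNA controls oncogenic RAS signaling in KRAS mutant cancers doi: 10.1038/s41422-022-00726-7. Cell Research. I read this paper closely recently. I know all of the authors, and the target they work on happens to be the one we are working on, so we talked a lot. This could well become a core solution for a major class of cancers in the future. The structure of RASON has not been solved yet, though. We will see what we can do about that this year.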
Abstract:
Mutations of the RAS oncogene are found in around 30% of all human cancers yet direct targeting of RAS is still considered clinically impractical except for the KRAS mutant. Here we report that RAS-ON (RASON), a novel protein encoded by the long intergenic non-protein coding RNA 00673 (LINC00673), is a positive regulator of oncogenic RAS signaling. RASON is aberrantly overexpressed in pancreatic ductal adenocarcinoma (PDAC) patients, and it promotes proliferation of human PDAC cell lines in vitro and tumor growth in vivo. CRISPR/Cas9-mediated knockout of Rason in mouse embryonic fibroblasts inhibits KRAS-mediated tumor transformation. Genetic deletion of Rason abolishes oncogenic KRAS-driven pancreatic and lung cancer tumorigenesis in LSL-Kras; Trp53 mice. Mechanistically, RASON directly binds to KRAS and inhibits both intrinsic and GTPase activating protein (GAP)-mediated GTP hydrolysis, thus sustaining KRAS in the GTP-bound hyperactive state. Therapeutically, deprivation of RASON sensitizes KRAS mutant pancreatic cancer cells and patient-derived organoids to EGFR inhibitors. Our findings identify RASON as a critical regulator of oncogenic KRAS signaling and a promising therapeutic target for KRAS mutant cancers.