小擎子
(2023-07-31 23:35):
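#paper doi:10.1038/s41467-023-38347-2 Nat Commun, 2023, A general model to predict small molecule substrates of enzymes based on machine and deep learning. A general model that can predict, for any single enzyme, whether each of roughly 1,400 small molecules is one of its substrates. It helps narrow the screening range when identifying the substrates of a specific enzyme, which lowers experimental cost. The drawback is a current false positive rate of 5%, so it is better suited to predicting candidate substrates for a single enzyme than for all enzymes in a genome-scale metabolic model.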
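To make the false-positive concern concrete, here is a rough back-of-the-envelope sketch. It assumes the 5% rate applies to every true non-substrate among the roughly 1,400 candidate molecules; the function name and the genome-scale enzyme count of 1,000 are hypothetical and chosen only for illustration.

```python
def expected_false_positives(n_candidates, false_positive_rate, n_true_substrates=0):
    """Expected number of non-substrates wrongly predicted as substrates.

    Assumes each true non-substrate is misclassified independently
    with probability `false_positive_rate` (an illustrative assumption).
    """
    n_negatives = n_candidates - n_true_substrates
    return n_negatives * false_positive_rate


# One enzyme screened against ~1,400 molecules at a 5% false positive rate.
per_enzyme = expected_false_positives(1400, 0.05)

# Hypothetical genome-scale model with 1,000 enzymes (not a figure from the paper).
genome_scale = 1000 * per_enzyme

print(f"per enzyme: about {per_enzyme:.0f} false positives")
print(f"1,000 enzymes: about {genome_scale:.0f} false positives")
```

For one enzyme, a few dozen spurious candidates can still be checked experimentally; multiplied across every enzyme in a genome-scale model, the spurious hits would likely swamp the true ones.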
A general model to predict small molecule substrates of enzymes based on machine and deep learning
Abstract:
For most proteins annotated as enzymes, it is unknown which primary and/or secondary reactions they catalyze. Experimental characterizations of potential substrates are time-consuming and costly. Machine learning predictions could provide an efficient alternative, but are hampered by a lack of information regarding enzyme non-substrates, as available training data comprises mainly positive examples. Here, we present ESP, a general machine-learning model for the prediction of enzyme-substrate pairs with an accuracy of over 91% on independent and diverse test data. ESP can be applied successfully across widely different enzymes and a broad range of metabolites included in the training data, outperforming models designed for individual, well-studied enzyme families. ESP represents enzymes through a modified transformer model, and is trained on data augmented with randomly sampled small molecules assigned as non-substrates. By facilitating easy in silico testing of potential substrates, the ESP web server may support both basic and applied science.
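The abstract says the training data is augmented with randomly sampled small molecules assigned as non-substrates. A minimal sketch of that idea follows, assuming plain uniform sampling from a molecule pool; the function name, the `n_neg_per_pos` parameter, and the data layout are hypothetical, and this is not the authors' actual sampling procedure.

```python
import random


def augment_with_negatives(positive_pairs, molecule_pool, n_neg_per_pos=3, seed=0):
    """Return (enzyme, molecule, label) triples: label 1 for known
    enzyme-substrate pairs, label 0 for sampled presumed non-substrates."""
    rng = random.Random(seed)
    known = set(positive_pairs)
    pool = sorted(set(molecule_pool))  # sorted for reproducible sampling
    examples = [(enzyme, mol, 1) for enzyme, mol in positive_pairs]
    used = set()  # negatives already emitted, to avoid duplicates

    for enzyme, _ in positive_pairs:
        # Never label a known substrate of this enzyme as a negative.
        candidates = [
            m for m in pool
            if (enzyme, m) not in known and (enzyme, m) not in used
        ]
        k = min(n_neg_per_pos, len(candidates))
        for mol in rng.sample(candidates, k):
            used.add((enzyme, mol))
            examples.append((enzyme, mol, 0))
    return examples


positives = [("E1", "A"), ("E2", "B")]
pool = ["A", "B", "C", "D"]
data = augment_with_negatives(positives, pool, n_neg_per_pos=2)
```

The sampled negatives are only presumed non-substrates: a randomly drawn molecule could in fact be an uncharacterized substrate, so the labels carry some noise.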