张贝
(2022-04-30 20:39):
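#paper DOI: 10.1038/s41580-021-00407-0 Nat Rev Mol Cell Biol., 2021, A guide to machine learning for biologists. In recent decades, as biological datasets have grown substantially in scale and complexity, machine learning has increasingly been used to build informative and predictive models of the underlying biological processes. However, the specific machine learning methods are numerous and varied, and can be bewildering. How should one choose a particular machine learning technique for a given type of biological data? This review, published in Nature Reviews Molecular Cell Biology in 2021, gives readers a brief introduction to some key machine learning techniques: both traditional machine learning methods such as classification, regression, and clustering models, and the more recently developed and widely used techniques involving deep neural networks. It describes how different techniques suit specific types of biological data and points out what to consider when embarking on experiments involving machine learning. Finally, it discusses some emerging directions in machine learning research.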
A guide to machine learning for biologists
Abstract:
The expanding scale and inherent complexity of biological data have encouraged a growing use of machine learning in biology to build informative and predictive models of the underlying biological processes. All machine learning techniques fit models to data; however, the specific methods are quite varied and can at first glance seem bewildering. In this Review, we aim to provide readers with a gentle introduction to a few key machine learning techniques, including the most recently developed and widely used techniques involving deep neural networks. We describe how different techniques may be suited to specific types of biological data, and also discuss some best practices and points to consider when one is embarking on experiments involving machine learning. Some emerging directions in machine learning methodology are also discussed.