颜林林
(2022-06-22 00:43):
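#paper doi:10.1038/s41591-022-01768-5 Nature Medicine, 2022, Swarm learning for decentralized artificial intelligence in cancer histopathology. I recently came across swarm learning in a Nature paper (doi:10.1038/s41586-021-03583-3), which described a way to make use of clinical data across institutions without violating privacy regulations, and thereby to support precision-medicine research on highly heterogeneous diseases. This paper applies the same technique to tumor pathology image analysis. Pathology image analysis is a typical field that depends on large, high-quality datasets, and swarm learning lets partner institutions train an AI model jointly while avoiding both data transfer and data monopoly. The authors trained models on three colorectal cancer cohorts from Northern Ireland, Germany and the United States. The models analyze patients' H&E-stained slides to predict BRAF mutational status and microsatellite instability (MSI) / mismatch repair deficiency (dMMR) status, and their performance was validated on two independent cohorts from the United Kingdom. The three data nodes (research centers) that train the model never exchange raw data directly. Instead, at each iteration step they synchronize model parameters through decentralized blockchain technology. As a result, the nodes are peers: there is no special central node that has to aggregate the others. This setup opens the door to collaborations on a larger scale and across more institutions, which should in turn drive further progress in pathology image analysis models.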
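The peer-to-peer parameter synchronization described above can be illustrated with a toy sketch. This is not the paper's implementation: the model here is a one-variable linear regression rather than a histopathology network, there is no blockchain layer, and the function names (`local_step`, `merge`, `swarm_train`, `make_cohort`), the synthetic cohorts and the cohort-size weighting are all assumptions made for illustration. It only shows the idea that each node trains on its own data and that nothing but parameters is exchanged.

```python
import random

def local_step(weights, data, lr=0.1):
    """One gradient-descent step of least-squares regression on a node's own data."""
    w, b = weights
    n = len(data)
    grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
    grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
    return (w - lr * grad_w, b - lr * grad_b)

def merge(all_weights, sizes):
    """Average the peers' parameters, weighted by cohort size (an assumption of this sketch)."""
    total = sum(sizes)
    return tuple(
        sum(wts[i] * n for wts, n in zip(all_weights, sizes)) / total
        for i in range(len(all_weights[0]))
    )

def swarm_train(peer_data, rounds=200, lr=0.1):
    """Alternate local training and parameter merging; raw data never leaves a node."""
    weights = [(0.0, 0.0) for _ in peer_data]
    sizes = [len(d) for d in peer_data]
    for _ in range(rounds):
        # Each node updates its copy of the model on local data only.
        weights = [local_step(w, d, lr) for w, d in zip(weights, peer_data)]
        # Every node can compute the same merge from the shared parameters,
        # so no node plays the role of a central aggregator.
        merged = merge(weights, sizes)
        weights = [merged for _ in peer_data]
    return weights[0]

def make_cohort(n, rng):
    """Hypothetical synthetic cohort drawn from y = 2x + 1 plus small noise."""
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    return [(x, 2.0 * x + 1.0 + rng.gauss(0, 0.01)) for x in xs]

rng = random.Random(0)
peers = [make_cohort(n, rng) for n in (50, 80, 120)]
w, b = swarm_train(peers)
print(f"w={w:.3f}, b={b:.3f}")
```

In this toy setting, merging after every step with cohort-size weights reproduces a gradient step on the pooled data, so the nodes recover roughly the same model as training on the merged dataset would. Real deployments are likely to synchronize less often and to use other weighting schemes.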
IF: 58.700, Q1
Nature medicine,
2022-06.
DOI: 10.1038/s41591-022-01768-5
PMID: 35469069
PMCID: PMC9205774
Swarm learning for decentralized artificial intelligence in cancer histopathology
Abstract:
Artificial intelligence (AI) can predict the presence of molecular alterations directly from routine histopathology slides. However, training robust AI systems requires large datasets for which data collection faces practical, ethical and legal obstacles. These obstacles could be overcome with swarm learning (SL), in which partners jointly train AI models while avoiding data transfer and monopolistic data governance. Here, we demonstrate the successful use of SL in large, multicentric datasets of gigapixel histopathology images from over 5,000 patients. We show that AI models trained using SL can predict BRAF mutational status and microsatellite instability directly from hematoxylin and eosin (H&E)-stained pathology slides of colorectal cancer. We trained AI models on three patient cohorts from Northern Ireland, Germany and the United States, and validated the prediction performance in two independent datasets from the United Kingdom. Our data show that SL-trained AI models outperform most locally trained models, and perform on par with models that are trained on the merged datasets. In addition, we show that SL-based AI models are data efficient. In the future, SL can be used to train distributed AI models for any histopathology image analysis task, eliminating the need for data transfer.