颜林林
(2023-11-25 09:20):
#paper doi:10.1016/S2589-7500(23)00219-4. The Lancet Digital Health. 2023, Operational greenhouse-gas emissions of deep learning in digital pathology: a modelling study. 这篇文章其实是针对数字病理学中大量应用深度学习技术的现状,对不同模型和分析策略的使用,进行能耗分析。而在能耗评估时,文章通过IP地址关联到相应电厂,结合该地区能源供应中可持续能源的占比等信息,换算成二氧化碳排放量,作为对环境可持续的影响评估。其研究结果显示,全球范围内深度学习在病理学中的广泛应用,可能导致大量的温室气体排放,最高可达16兆吨二氧化碳当量,需要大约86,590平方公里(0.22%)的森林来吸收这些排放,并由此呼吁大家关注相关问题。对于模型之间的比较,结果自然是使用升级后的模型或更简单的模型,可以降低能耗,从而有助于应对环境可持续问题。我个人对这种将能耗外推至碳排放的做法难以苟同,毕竟这中间的影响因素太多,将其过度简化的过程中,很容易人为操纵和引导结论,不过,这篇论文在能耗对比评估的部分还是有可取之处的。
Operational greenhouse-gas emissions of deep learning in digital pathology: a modelling study
翻译
Abstract:
BACKGROUND: Deep learning is a promising way to improve health care. Image-processing medical disciplines, such as pathology, are expected to be transformed by deep learning. The first clinically applicable deep-learning diagnostic support tools are already available in cancer pathology, and their number is increasing. However, data on the environmental sustainability of these tools are scarce. We aimed to conduct an environmental-sustainability analysis of a theoretical implementation of deep learning in patient-care pathology.METHODS: For this modelling study, we first assembled and calculated relevant data and parameters of a digital-pathology workflow. Data were breast and prostate specimens from the university clinic at the Institute of Pathology of the Rheinisch-Westfälische Technische Hochschule Aachen (Aachen, Germany), for which commercially available deep learning was already available. Only specimens collected between Jan 1 and Dec 31, 2019 were used, to omit potential biases due to the COVID-19 pandemic. Our final selection was based on 2 representative weeks outside holidays, covering different types of specimens. To calculate carbon dioxide (CO2) or CO2 equivalent (CO2 eq) emissions of deep learning in pathology, we gathered relevant data for exact numbers and sizes of whole-slide images (WSIs), which were generated by scanning histopathology samples of prostate and breast specimens. We also evaluated different data input scenarios (including all slide tiles, only tiles containing tissue, or only tiles containing regions of interest). To convert estimated energy consumption from kWh to CO2 eq, we used the internet protocol address of the computational server and the Electricity Maps database to obtain information on the sources of the local electricity grid (ie, renewable vs non-renewable), and estimated the number of trees and proportion of the local and world's forests needed to sequester the CO2 eq emissions. We calculated the computational requirements and CO2 eq emissions of 30 deep-learning models that varied in task and size. The first scenario represented the use of one commercially available deep-learning model for one task in one case (1-task), the second scenario considered two deep-learning models for two tasks per case (2-task), the third scenario represented a future, potentially automated workflow that could handle 7 tasks per case (7-task), and the fourth scenario represented the use of a single potential, large, computer-vision model that could conduct multiple tasks (multitask). We also compared the performance (ie, accuracy) and CO2 eq emissions of different deep-learning models for the classification of renal cell carcinoma on WSIs, also from Rheinisch-Westfälische Technische Hochschule Aachen. We also tested other approaches to reducing CO2 eq emissions, including model pruning and an alternative method for histopathology analysis (pathomics).FINDINGS: The pathology database contained 35 552 specimens (237 179 slides), 6420 of which were prostate specimens (10 115 slides) and 11 801 of which were breast specimens (19 763 slides). We selected and subsequently digitised 140 slides from eight breast-cancer cases and 223 slides from five prostate-cancer cases. Applying large deep-learning models on all WSI tiles of prostate and breast pathology cases would result in yearly CO2 eq emissions of 7·65 metric tons (t; 95% CI 7·62-7·68) with the use of a single deep-learning model per case; yearly CO2 eq emissions were up to 100·56 t (100·21-100·99) with the use of seven deep-learning models per case. CO2 eq emissions for different deep-learning model scenarios, data inputs, and deep-learning model sizes for all slides varied from 3·61 t (3·59-3·63) to 2795·30 t (1177·51-6482·13. For the estimated number of overall pathology cases worldwide, the yearly CO2 eq emissions varied, reaching up to 16 megatons (Mt) of CO2 eq, requiring up to 86 590 km2 (0·22%) of world forest to sequester the CO2 eq emissions. Use of the 7-task scenario and small deep-learning models on slides containing tissue only could substantially reduce CO2 eq emissions worldwide by up to 141 times (0·1 Mt, 95% CI 0·1-0·1). Considering the local environment in Aachen, Germany, the maximum CO2 eq emission from the use of deep learning in digital pathology only would require 32·8% (95% CI 13·8-76·6) of the local forest to sequester the CO2 eq emissions. A single pathomics run on a tissue could provide information that was comparable to or even better than the output of multitask deep-learning models, but with 147 times reduced CO2 eq emissions.INTERPRETATION: Our findings suggest that widespread use of deep learning in pathology might have considerable global-warming potential. The medical community, policy decision makers, and the public should be aware of this potential and encourage the use of CO2 eq emissions reduction strategies where possible.FUNDING: German Research Foundation, European Research Council, German Federal Ministry of Education and Research, Health, Economic Affairs and Climate Action, and the Innovation Fund of the Federal Joint Committee.
翻译