Artificial intelligence (AI) adoption in the healthcare domain has been swift and comprehensive, including many recent efforts here at UHN, and is projected to increase rapidly. This is being driven by increases in data collection and by growing interest in personalized medicine in the face of a data deluge. In turn, this growth has been enabled by ever-cheaper computing power and the development of high-performance AI algorithms. The result so far has been the broad incorporation of AI throughout healthcare, with much greater incorporation on the horizon. Excellent in-depth reviews of these topics can be found here and here.

This large-scale adoption of AI in healthcare comes at a critical crossroads. The general trend of population aging means that it will be important to adopt precision medicine techniques to both improve patient outcomes and increase the capacity of healthcare systems. However, against this backdrop, we face a warming planet, and one of the major contributors is the CO2 emitted by human activities, including the adoption of new technologies and the (often fossil-fuel-derived) energy they require. Climate change can also affect human health directly. It is my opinion that it is possible to implement AI in healthcare on a large scale while also mitigating its environmental impacts, if enough consideration is given to the principled and responsible application of AI and related technologies.

What is AI? (A very brief introduction)

AI encompasses a set of novel technologies that are poised to revolutionize healthcare. AI is often portrayed as a monolith, but that is far from the reality. What is frequently called “AI” encompasses systems based on machine learning techniques with associated data processing frameworks/pipelines. Machine learning (ML) methods can be seen as components of AI systems. The ML that underpins AI systems can come in the form of simpler, classical systems that are rooted in statistics, or modern systems that rely on deep learning to process truly massive amounts of data (the performance of classical ML methods often suffers with truly large datasets, which spurred the development of deep learning). Owing to its complexity, much of the data of interest in the modern healthcare setting (imaging, electronic health records (EHRs), and even speech and text data) is often handled in some way using deep learning-based methods. For the rest of this post, I’ll use the term “AI” as a stand-in for various systems that incorporate ML and/or deep learning techniques.

Environmental impacts of AI

Much recent focus has been placed on the potential environmental costs of widespread AI adoption, including in healthcare. See, for example, here and here. It is important to keep in mind that the advanced computations underlying ML models and AI systems can require large amounts of energy which, if derived from fossil-fuel sources, can generate considerable CO2 emissions at scale. A 2019 study by Strubell et al. summarized the energy requirements, and corresponding CO2 emissions, of training several well-known deep learning models. It is clear that the larger models carry large potential impacts even from a single training round (note that training is often done once or intermittently; see the next section for more detail). See Figure 1, below, for a summary focusing specifically on models built on the transformer architecture (see Vaswani et al., 2017 for a description of the original transformer model architecture). These models tend to be trained on large datasets that consist of sequential data (e.g., text or audio).

Figure 1. A copy of Table 3 from Strubell et al. (2019) depicting the energy requirements (and costs in $USD) of training several well-known deep learning language models. TPU = tensor processing unit. PUE = Power Usage Effectiveness. Initialisms describing neural network language models: ELMo = Embeddings from Language Models. BERT = Bidirectional Encoder Representations from Transformers. NAS = Neural Architecture Search. GPT = Generative Pre-trained Transformer. For details of the transformer architecture, see this very well-known visual walkthrough.

Note that there are also ways to mitigate the potentially negative environmental impacts of widespread AI adoption. Another paper by Strubell et al. (2020) summarized four ways to approach this; I have highlighted two of them here.

Smart algorithm choice

Deep learning models can be trained once, or trained and then retrained intermittently, which incurs a lot of short-term energy use. However, depending on the use case, even a model that has already been trained can consume a lot of energy when it is used. This amount depends on, among other things, the choice of algorithm underlying the model, and the energy required to process a single example (e.g., in image analysis) can vary widely. Desislavov et al. (2021) presented a detailed overview of the energy requirements of different computer vision (CV) models, and Figure 9 from their paper (labelled for this post as Figure 2, below) provides an excellent summary of the amount of energy required to process a single example (red circles and horizontal red dashed line added by me). “Processing a single example” through an already-trained model is what is meant by the phrase “forward pass” in Figure 2. Notably, it is possible with newer/better-designed algorithms to obtain better performance (“Top-1 Acc.” in the figure) for a lower energy cost to process a single example (log Joules, y-axis of the figure). This indicates that, with a bit of careful planning and design, it is possible to incorporate high-performance AI systems into workflows without incurring energy penalties.

Figure 2. Copy of Figure 9 from Desislavov et al., 2021. Red circles and red dashed horizontal line added for this blog post. Note that, for the same amount of energy, a newer AI model can be 20% (absolute) more accurate at a classification task than an older one given the same training data.
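To give a feel for how per-example energy adds up in deployment, here is a minimal sketch that converts a per-forward-pass energy into kilowatt-hours per year. It assumes the “log Joules” axis is base 10, and every number in it (the two energy values and the yearly workload) is a hypothetical placeholder chosen for illustration, not a value read from Figure 2.

```python
JOULES_PER_KWH = 3.6e6  # 1 kWh = 3.6 million joules


def annual_inference_kwh(log10_joules_per_pass, passes_per_year):
    """Convert a per-forward-pass energy (in log10 joules) into kWh per year."""
    joules_per_pass = 10 ** log10_joules_per_pass
    return joules_per_pass * passes_per_year / JOULES_PER_KWH


# Hypothetical placeholder values, NOT taken from Desislavov et al. (2021).
PASSES_PER_YEAR = 10_000_000  # assumed number of images processed per year
older_model_kwh = annual_inference_kwh(1.0, PASSES_PER_YEAR)  # assumed 10 J per pass
newer_model_kwh = annual_inference_kwh(0.0, PASSES_PER_YEAR)  # assumed 1 J per pass

print(f"Older model: {older_model_kwh:.1f} kWh/year")
print(f"Newer model: {newer_model_kwh:.1f} kWh/year")
```

Because the axis is logarithmic, a one-unit difference on the figure corresponds to a tenfold difference in energy per example, which carries through directly to the yearly total.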

Smart power source choice

When speaking of energy usage, it is also important to acknowledge that not all sources are created equal: some rely largely on renewable and/or non-emitting sources, whereas others use fossil fuels/non-renewables. Figure 3 (generated for this article using the data available here) presents the CO2 emissions produced per kilowatt-hour of energy consumed. The range is approximately 50-fold from the best in the world (Ontario and Quebec are among these, thanks to our provinces’ usage of hydroelectricity and nuclear energy) to the worst in the world (mostly coal-derived power). Therefore, data centres located in “green” areas would be favoured over ones located in “yellow” or “red” areas.

In fact, if we modify the table from Strubell et al. (above) to use figures specific to Ontario, we can see the positive effect that reducing CO2 emissions at the source can have. Ontario’s mix yields approximately 26 g (0.026 kg) of CO2/kWh (2019 values). Multiplying the number of kWh used to train each model by the CO2 intensity of Ontario’s energy mix yields the estimated CO2 emissions if the same models and same procedures were followed using an Ontario-based energy source. This represents an emissions reduction of approximately 94%.

Model    kWh        CO2 lbs (Strubell)    CO2 lbs (if in Ontario)
NAS      656,347    626,155               37,569.3
Table 1. Emissions from Strubell et al., who estimated CO2 emissions using a factor of 0.954 pounds (0.434 kg) of CO2 per kWh generated.
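The arithmetic behind this comparison can be sketched in a few lines. This is a minimal illustration using only the figures quoted in this post (656,347 kWh for NAS, 0.434 kg CO2/kWh for the Strubell et al. factor, and 0.026 kg CO2/kWh for Ontario); the function and variable names are made up for the example, and small differences from the table values come from rounding in the conversion factors.

```python
LB_PER_KG = 2.20462  # pounds per kilogram


def co2_lbs(energy_kwh, intensity_kg_per_kwh):
    """Estimated CO2 emissions in pounds for a given energy use and grid intensity."""
    return energy_kwh * intensity_kg_per_kwh * LB_PER_KG


NAS_KWH = 656_347           # training energy for NAS, as quoted in Table 1
STRUBELL_INTENSITY = 0.434  # kg CO2 per kWh, factor used by Strubell et al.
ONTARIO_INTENSITY = 0.026   # kg CO2 per kWh, Ontario mix (2019 value quoted above)

baseline_lbs = co2_lbs(NAS_KWH, STRUBELL_INTENSITY)
ontario_lbs = co2_lbs(NAS_KWH, ONTARIO_INTENSITY)
reduction = 1 - ontario_lbs / baseline_lbs

print(f"Baseline: {baseline_lbs:,.0f} lbs CO2")
print(f"Ontario:  {ontario_lbs:,.0f} lbs CO2")
print(f"Reduction: {reduction:.0%}")
```

Note that the percentage reduction depends only on the ratio of the two intensities (0.026 / 0.434), so it is the same for every model in the original table regardless of how much energy the training run used.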
Figure 3. Summary of CO2 emissions by location around the world.


To summarize, we now find ourselves at a critical intersection between differing needs: the need to care for an aging population with precision medicine techniques, and the need to reduce the impacts of human energy use on the global climate. To meet both of these goals, it will be important to build scalable and efficient AI-powered systems with greenness baked into them from the start. AI adoption in healthcare has the potential to dramatically improve healthcare systems and patients’ experience, as well as ease some burdens on healthcare staff. This will enable future generations of healthcare systems to continue to provide services widely and effectively without unnecessary environmental harm.

About the author: Dr. Leif Simmatis is a postdoctoral fellow in the Speech Production Lab at the University of Toronto and at KITE. His research focuses on using AI, statistics, and data science to develop digital health solutions.