Artificial intelligence (AI) is rapidly becoming ubiquitous, so much so that it has been argued that “AI … is becoming an infrastructure that many services of today and tomorrow will depend upon.”25 Current progress in the field of AI is spearheaded by machine learning (ML) techniques such as deep learning, which has rendered many tasks previously thought to be beyond the reach of AI more or less solved. The past decades have seen an exponential rise in the amount of compute used by ML systems,29 which has led to a corresponding rise in energy consumption and carbon emissions.17,23,37 Beyond carbon emissions, increased production and use of the hardware infrastructure needed for ML is potentially exacerbating broader environmental impacts.15 While on the one hand ML systems can be used to make progress toward the sustainable development goals (SDGs),27,34 on the other hand the factors mentioned here limit the sustainability of ML from an environmental perspective.
A major focus of the ML community in pursuit of sustainable ML (more specifically, improving the sustainability of ML34) has been to make ML systems and the hardware that runs them more efficient.5,23,37 Efficiency in this context is understood through the relationship between three factors: compute, generally measured in terms of floating point operations (FLOPs), the number of parameters used by an ML system, and/or the amount of time needed to perform a particular computation; energy, which is generally measured in terms of the kilowatt-hours (kWh) required to perform the compute; and carbon, generally measured in terms of the equivalent grams of CO2 (gCO2eq) emitted due to the energy consumption.
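As a minimal formal sketch of how these three factors relate (the notation is ours, introduced only for illustration), operational emissions can be written as the product of the energy consumed and the carbon intensity of the electricity used:

```latex
% C: operational carbon emitted (gCO2eq)
% E: energy consumed (kWh)
% I(\ell, t): carbon intensity (gCO2eq/kWh) at location \ell and time t
C = E \times I(\ell, t), \qquad
E \approx \text{compute} \times \text{energy per unit of compute}
```

Neither factor on the right-hand side is fully determined by compute metrics alone, which is the root of the first discrepancy below.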
The aim of ML efficiency is to reduce the cost (for example, energy or carbon) of a given unit of output (for example, compute). These efficiency improvements can reduce the carbon emissions of ML systems, and should be continued, but they can also fall short. This becomes evident when considering the overall goal of improving the environmental sustainability of ML, as improvements in efficiency often have unexpected effects.9,11,36 Additionally, efficiency primarily addresses operational emissions while exacerbating the relative impact of embodied emissions, and may be outpaced by the growing infrastructure needed to support ML as a technology.14,25,37
In this article, we present a critical perspective on environmentally sustainable ML that examines the relationship between the efficiency of ML systems and their overall environmental impact. We hope to demonstrate comprehensively, at multiple levels of granularity and with both technical and non-technical reasons, why efficiency alone is not enough to remedy the adverse environmental impacts of ML. We express this through three high-level discrepancies between the effect of efficiency on the environmental sustainability of ML when viewed narrowly and when considering the many variables with which it interacts:
Discrepancy 1: Compute efficiency ≠ energy efficiency ≠ carbon efficiency.
Discrepancy 2: Efficiency has unexpected effects on operational emissions across the ML model life cycle.
Discrepancy 3: Efficiency does not account for, and can potentially exacerbate, broader environmental impacts from hardware platforms.
Based on these discrepancies, we argue that to make ML more environmentally sustainable, it will be necessary to address the complexity resulting from the interaction of many factors that affect the sustainability of ML as a technology. Here, ML “as a technology” includes ML systems and the people who use them: computation, ML model life cycles, human behavior, the supply chain, economic forces, and more. Finally, we posit that systems thinking, which provides a lens and framework with which to deal with complexity, offers a potential path toward accomplishing the goal of making ML as a technology environmentally sustainable.
The Three Discrepancies
Here, we discuss the three discrepancies; for an extended discussion and list of references, please see the supplemental material.a
Discrepancy 1: Compute efficiency ≠ energy efficiency ≠ carbon efficiency. At face value, it would appear that reducing compute would reduce energy consumption, which would in turn reduce carbon emissions. However, operational carbon emissions are a function of both energy and carbon intensity, which depends on time and location, and energy is a complex function of several factors that metrics of compute (for example, FLOPs, number of parameters, and runtime) do not fully capture. As such, savings in the amount of compute used by a model, as measured by these metrics, do not always translate to savings in energy due to, for example, the specifics of model architecture and hardware.11 Furthermore, savings in energy consumption may not translate into savings in operational carbon emissions if compute is not run in locations and at times where carbon intensity is low.2,8,11,b This discrepancy has been well documented in the literature, with multiple studies demonstrating and calling for a more holistic perspective on model efficiency.11,14,33
We present further evidence of the unintuitive effects of compute efficiency on energy and operational emissions in Figure 1, using statistics derived from 423k unique convolutional neural network (CNN) architectures from Bakhtiarifard et al.3 Looking at carbon intensity (Figure 1d), we see that each location has a vastly different average intensity, a large amount of variation, and several peaks, as indicated by the number of outliers. Carbon intensity can change sporadically as a result of changing demand. ML jobs that could otherwise be run when carbon intensity is low thus have the potential to emit far more carbon than is necessary.8
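The sketch below illustrates, in toy form, the kind of carbon-aware scheduling this observation enables: given an hourly forecast of grid carbon intensity, choose the window with the lowest average intensity in which to run a job. All forecast values below are made up for illustration.

```python
# A toy sketch of carbon-aware job scheduling: given an hourly forecast of
# grid carbon intensity (gCO2eq/kWh), pick the contiguous window with the
# lowest average intensity in which to run a job.

def best_start_hour(forecast, job_hours):
    """Return the start hour minimizing mean carbon intensity over the job."""
    means = [
        sum(forecast[i:i + job_hours]) / job_hours
        for i in range(len(forecast) - job_hours + 1)
    ]
    return min(range(len(means)), key=means.__getitem__)

# Hypothetical 24-hour forecast: low intensity overnight, peaking in the evening.
forecast = [120, 110, 100, 95, 90, 100, 150, 220, 300, 320, 310, 290,
            280, 270, 260, 280, 330, 380, 400, 370, 300, 220, 170, 140]

start = best_start_hour(forecast, job_hours=4)
print(f"Run the 4-hour job starting at hour {start}")  # hour 2, i.e., 02:00 to 06:00
```

A real scheduler would of course work from live forecasts and contend with the deployment constraints discussed under Discrepancy 2, but the principle is the same: the energy consumed is identical, while the carbon emitted is not.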
The impact on operational emissions of improvements in either compute or energy efficiency can thus often differ from what is expected. This is because variables such as runtime and number of parameters are not fully predictive of energy consumption, and energy consumption is not fully predictive of carbon emissions. It also means that one should take care to actually measure compute, energy, and operational emissions in order to observe the impact of actions intended to reduce those emissions, for example, by using one of the many available carbon-tracking tools.2,11
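As an illustration, the sketch below uses CodeCarbon, one such carbon-tracking tool; the training function is a toy stand-in, and API details may differ across library versions.

```python
# A minimal sketch of measuring, rather than assuming, operational emissions,
# using the CodeCarbon library (pip install codecarbon). The "training" here
# is a toy stand-in for a real training loop.
from codecarbon import EmissionsTracker

def train_model():
    # Stand-in workload; replace with an actual training loop.
    return sum(i * i for i in range(10_000_000))

tracker = EmissionsTracker(project_name="cnn-training")
tracker.start()
try:
    train_model()
finally:
    emissions_kg = tracker.stop()  # estimated emissions for this run, in kg CO2eq

print(f"Estimated operational emissions: {emissions_kg:.6f} kg CO2eq")
```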
Discrepancy 2: Efficiency across the model life cycle. Discrepancy 1 described the complexity arising from factors that influence operational emissions at the level of compute. Developing, producing, and using an ML system in practice involves many actions that require compute and energy, and thus emit carbon. Efficiency will influence the decisions one makes throughout the model life cycle, and these decisions will not always lead to reductions in carbon emissions. Here, we describe the unintuitive effects of efficiency on operational emissions when observed at the level of the model life cycle.
The model life cycle is generally broken down into two primary stages: development and deployment (see Figure 2). The split in compute, energy, and operational emissions between development and deployment depends on several factors. Many methods that are advertised as “efficient” are mainly applicable to only one part of the model life cycle, and may in fact incur a net increase in cost in the end. Applying or abstaining from efficient methods can thus have a far-reaching impact on the total operational emissions of a model over its life cycle. For example, job scheduling allows one to reduce operational emissions during training by choosing to train models at times and in locations with lower carbon intensity (in the spirit of the scheduling sketch shown earlier).8 AI systems may offer the opportunity to decouple where a service is used from where most energy is consumed. However, job scheduling is not always viable, as the ability to select where and when to run may be limited by constraints on how the trained models are used (for example, when deployment latency, on-demand use, or privacy are of concern). How to holistically minimize operational emissions over the entire model life cycle is thus an open question, and addressing it requires being able to characterize the operational emissions resulting from multiple decisions over time.

Furthermore, attempts to reduce operational emissions via efficiency may not succeed in practice, as theoretical reductions in operational emissions (for example, through the deployment of efficient models) can eventually result in greater emissions in practice. It is well documented that energy and carbon mitigation strategies are subject to rebound effects9,36 (a.k.a. Jevons paradox), which occur when the observed reduction in carbon emissions due to an improvement in efficiency is less significant than the expected reduction, or when emissions actually increase. The rebound effect has been documented at multiple large companies with respect to energy consumption from ML systems.23,37 It occurs for a number of reasons, but is largely facilitated by the economic, psychological, and behavioral factors that accompany efficiency improvements.
For example, an ML practitioner may experience rebound effects when improving a model’s compute efficiency such that it can be trained on a single GPU in less time. This efficiency gain allows for larger-scale experimentation, such as training longer, using more data, and exploring a broader range of hyperparameters. These extended activities can ultimately consume more energy and produce higher emissions than using the original, less efficient model with a limited search. Such behavior stems from the perceived “attenuated consequences” of improving model efficiency.28 As such, operational emissions throughout the model life cycle can be particularly difficult to predict, as they are largely driven by behavior stemming from both a lack of awareness and competing incentives.36 More concretely, this can be a lack of awareness of which aspects of the model life cycle a particular efficiency improvement targets, behavior that leads to significantly more compute over time,28 incentives to scale up in order to improve accuracy and serve a larger user base, and more. The net effect is that improved efficiency does not mean operational emissions across the life cycle will fall; in some cases it can lead to further increases. Thus, in addition to the factors discussed previously at the level of compute, we must reckon with different factors at the level of model life cycles that affect operational emissions in order to move toward the goal of reducing them. This calls for both technical and non-technical (for example, regulatory) solutions.
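To make the arithmetic of such a rebound concrete, consider the following toy calculation; all numbers are hypothetical.

```python
# A toy illustration of the rebound effect, with entirely hypothetical numbers.
# An efficiency improvement halves the energy per training run, but the
# cheaper runs invite a 4x larger hyperparameter search.
energy_per_run_kwh = 100.0   # energy per training run, before the improvement
runs_before = 10             # size of the original hyperparameter search

energy_per_run_after = energy_per_run_kwh / 2  # 2x more energy efficient
runs_after = runs_before * 4                   # larger search, enabled by cheap runs

total_before = energy_per_run_kwh * runs_before   # 1,000 kWh
total_after = energy_per_run_after * runs_after   # 2,000 kWh
print(f"Before: {total_before:.0f} kWh; after: {total_after:.0f} kWh")
# Despite a 2x per-run efficiency gain, total energy consumption doubles.
```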
Discrepancy 3: Efficiency and platforms. Computing platforms (the hardware and infrastructure on which ML compute runs) come with their own set of environmental impacts, including but not limited to carbon emissions. These impacts are diverse and highly distributed among many processes and people, and have the potential to worsen as ML becomes more widely adopted.25 Efficiency can affect this both positively and negatively: on the one hand reducing the compute and energy needs of hardware, and on the other facilitating greater use and manufacture of existing and emerging hardware platforms.14,25 In light of this, it is becoming increasingly important to account for the environmental impacts of ML platforms and the factors that give rise to them.
Manufacturing the devices on which ML systems run requires the mining of various materials (for example, critical minerals), yielding multiple pollutants and hazardous products such as radioactive and toxic chemical components.4 Poor mining practices can allow such chemicals to enter food and water supplies and cause downstream health impacts.22 The mining of resources such as gold, nickel, copper, and other critical minerals additionally contributes significantly to deforestation, threatens to worsen the effects of climate change, impacts biodiversity and critical ecosystems such as those in the Amazon, and harms Indigenous communities. It is currently unclear what ML systems contribute to these impacts, as data describing them is lacking, but the impacts are known to be significant in the ICT sector as a whole.25
Additionally, the mining and device manufacturing processes result in their own carbon emissions (a.k.a. embodied emissions). These embodied emissions can vary greatly: it has been estimated that they account for approximately 10% of total emissions in data centers and 40%–80% of total emissions for devices at the edge, such as mobile phones and the sensors that collect data. A significant portion of a model’s total carbon footprint can therefore come from embodied emissions. For example, Luccioni et al.18 estimate that the embodied emissions from training BLOOM, a 176B-parameter large language model (LLM), constituted 22% of its total emissions (11.2 tons CO2eq).
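The BLOOM figures above allow a quick back-of-the-envelope check of how embodied and operational emissions trade off; the sketch below uses only the numbers just cited.

```python
# Back-of-the-envelope check using the BLOOM estimate cited above:
# embodied emissions of ~11.2 tCO2eq were reported as ~22% of the total.
embodied_t = 11.2
embodied_share = 0.22

total_t = embodied_t / embodied_share  # ~50.9 tCO2eq in total
operational_t = total_t - embodied_t   # ~39.7 tCO2eq operational
print(f"Total: {total_t:.1f} t CO2eq; operational: {operational_t:.1f} t CO2eq")
# Efficiency gains targeting only the operational share leave the embodied
# portion untouched; the more operations shrink, the larger that portion looms.
```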
Furthermore, much of ML compute, particularly for the emerging large deep learning models, is performed in data centers. Data centers require a significant amount of water for electricity generation and cooling, and ML systems are playing an increasingly large role in this water consumption.15 For example, Li et al.15 estimated that training GPT-3, another LLM with 175B parameters, required 700,000 liters of clean fresh water.
Finally, at their end of life, devices are either recycled, repurposed, or disposed of, where repurposing and disposal result in e-waste.35 The environmental impacts of e-waste stem from its physical dumping on land: hazardous chemicals can leak into land and water supplies22 and affect local biodiversity. As with the impacts of mining, the contribution of ML to the impacts of e-waste is not well understood.
At the level of data centers, efficiency has helped to keep energy consumption from rising at the same pace as compute loads in recent years.20 Typical server refresh times, at which devices reach end of life (becoming e-waste) and new devices are purchased and installed (resulting in embodied emissions and all of the impacts of device manufacturing), appear to be lengthening, potentially aided by increased device energy efficiency.6 However, gains in device energy efficiency are also slowing, in line with the slowing of Moore’s law,30 so it is not clear whether this trend will continue. As with efficiency across the life cycle, efficiency at the level of hardware could result in rebound effects as hardware becomes cheaper, leading to increased demand.10 Indeed, demand for ML hardware has increased in recent years despite improvements in efficiency,7,12 and this is likely to continue. This is particularly the case for edge devices, as improvements in compute and energy efficiency enable more ML compute to be performed outside of large data centers, potentially facilitating rebound effects in operational energy consumption and carbon emissions.37 Additionally, the broader environmental impacts of device manufacture will potentially worsen if not accounted for and mitigated.
Given this, efficiency at the level of platforms is limited both by the slowing of hardware energy efficiency30 and by behavioral limits such as the rebound effect.10 Worse, even accounting for the environmental impacts that platforms incur as a result of ML is currently difficult, due to the complexity of the contributing factors and/or a lack of transparency.15,19 As such, platforms add a significant amount of complexity to the problem of making ML environmentally sustainable. Addressing this, as well as the impacts of compute across the model life cycle, will benefit from understanding and managing this complexity. In this light, efficiency is only a partial solution.
Beyond Efficiency: Systems Thinking
While we are critical of efficiency throughout this perspective, we note that it remains important, as it can help mitigate the environmental impact of ML systems. Thus, we encourage the community to foster a more honest and realistic discourse around efficiency in ML: being precise about what is efficient when describing “efficiency,” and being wary of conflating efficiency with environmental sustainability as a whole. The discrepancies described in this perspective are intended to elucidate why efficiency is not enough to achieve the goal of making ML as a technology environmentally sustainable.
We see efficiency as one means of improving the environmental sustainability of ML, one that interacts with several variables at multiple levels. This complexity leads to other systemic issues beyond the unintuitive effects of efficiency. For example, depending on which factors are chosen to be measured and how values such as data center efficiency, embodied emissions, and carbon intensity are determined, one can conclude either that the carbon footprint of ML training will plateau and shrink23 or that the observed exponential increase in the carbon footprint of ML training17 will continue in the near future. These issues persist at the level of individual models, exemplified by the difference between the carbon emissions of the Evolved Transformer31 reported by Strubell et al.32 and by Patterson et al.23
The goal of the paper by Strubell et al. was to characterize the carbon emissions of modern ML circa 2019; as one component of this, they had to estimate some quantities needed to calculate the emissions of the model selection stage for the Evolved Transformer (due to a lack of transparency and reporting of these emissions in the Evolved Transformer paper), including variables related to the compute itself and variables related to the infrastructure used to run it. Three years later, Patterson et al. argued that the previous estimate was approximately 88x too highc when considering the actual settings used for model selection. Such differences arise from a lack of transparency around critical data (for example, embodied emissions) and misalignment over which factors in ML to consider when measuring environmental impacts. This, in addition to the discrepancies discussed previously, illuminates the need for a new, more holistic and effective way to approach the environmental sustainability of ML as a technology.
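To see concretely how assumption choices drive such divergent estimates, consider the toy calculation below; all numbers are hypothetical and are not the figures used by either study.

```python
# A toy illustration of how estimate assumptions compound (all numbers are
# hypothetical; these are not the figures from Strubell et al. or
# Patterson et al.).
def estimate_gco2eq(gpu_hours, watts_per_gpu, overhead, intensity):
    """overhead: datacenter overhead multiplier; intensity: gCO2eq/kWh."""
    kwh = gpu_hours * watts_per_gpu / 1000 * overhead
    return kwh * intensity

# The same 1,000-GPU-hour job under two sets of infrastructure assumptions:
high = estimate_gco2eq(1000, watts_per_gpu=300, overhead=1.6, intensity=500)
low = estimate_gco2eq(1000, watts_per_gpu=150, overhead=1.1, intensity=50)
print(f"{high / low:.0f}x difference from assumptions alone")  # ~29x
```

When each assumed parameter is uncertain, the discrepancies multiply, which is why estimates made in good faith by different parties can differ by orders of magnitude in the absence of transparent reporting.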
One way is to adopt systems thinking.1,21 Systems thinking is a well-established field of study24 that has been successfully applied in several areas, including engineering, management, computer science, and sustainability.36 It seeks to understand the relationship between the structure and behavior of complex systems: “interconnected sets of elements which are coherently organized in a way that achieves something.”21 A key insight of systems thinking is that complex systems are more than the sum of their parts. This is revealed through the systems lens, which looks at the behavior of the entire system as a whole, relating the components of the system to each other through causal feedback loops. Doing so can reveal previously unobserved and unexpected behavior, meaning that the “something” which a system achieves might not be what its designers intended.1 This contrasts with an approach that breaks a larger system down into more easily studied components, which obfuscates such behavior.23 Essentially, systems thinking is a conceptual shift from seeing how individual causes give rise to behavior (for example, a person reduces their carbon footprint by taking the bus instead of driving a car) to seeing how systems themselves behave (for example, carbon emissions are produced by the transportation system, of which people, buses, and cars are a part).
How can systems thinking bridge the gap between efficiency and the environmental sustainability of ML as a technology? Consider a standard practice in ML for improving model training and inference efficiency: using mixed precision, where the number of bits used in computations is dynamically adjusted. Viewed in isolation, the use of mixed precision should reduce the energy consumption of an ML model and thus its operational carbon emissions.d In that narrow sense, using mixed precision is sufficient for achieving “efficiency.” However, systems thinking invites us to observe and understand the behavior that arises through the systems lens: an action such as using mixed precision interacts with many variables affecting ML environmental sustainability, producing potentially unintuitive effects on variables such as carbon emissions.
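For reference, the sketch below shows one common way to enable mixed precision in PyTorch, using torch.autocast with a gradient scaler; the model and data are toy stand-ins, and API details may vary across PyTorch versions.

```python
# A minimal sketch of mixed-precision training in PyTorch: the isolated
# "efficiency" improvement discussed above. The model and batch are toy
# stand-ins; exact APIs may differ across PyTorch versions.
import torch
import torch.nn as nn

device = "cuda"  # mixed precision as sketched here targets CUDA devices
model = nn.Linear(512, 10).to(device)                 # toy stand-in model
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)
loss_fn = nn.CrossEntropyLoss()
scaler = torch.cuda.amp.GradScaler()                  # guards float16 gradients

inputs = torch.randn(32, 512, device=device)          # toy batch
targets = torch.randint(0, 10, (32,), device=device)

optimizer.zero_grad()
with torch.autocast(device_type="cuda", dtype=torch.float16):
    loss = loss_fn(model(inputs), targets)            # forward in reduced precision
scaler.scale(loss).backward()                         # backward on the scaled loss
scaler.step(optimizer)                                # unscale, then apply the update
scaler.update()
```

Nothing in this snippet, of course, captures the downstream effects discussed next: how the cheaper, faster runs it enables change experimentation, usage, and hardware demand.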
One can consider how reducing the bit precision of a model interacts with, for example, its speed, which can in turn influence how much experimentation one chooses to perform in order to find the best model, facilitating the rebound effect (discrepancy 2). Going further, one can account for changes in the model’s accuracy, which, combined with speed, can influence how frequently that system can be expected to be used, thus affecting operational emissions over time. One can then determine how each of these factors will influence the amount of hardware infrastructure required to support the downstream use of that model, as well as the type of hardware (for example, edge devices vs. cloud data centers) likely to be used as a result of improved algorithmic efficiency (yielding discrepancy 3). Thus, systems thinking is intended to reveal how a seemingly isolated change such as using mixed precision inevitably “releases or suppresses a behavior that is latent within the structure” of the system itself,21 where the “system” in this case encapsulates ML compute, life cycles, and platforms.
Importantly, understanding such systems and their tendencies toward particular behaviors can enable us both to make the best use of the tools we have (for example, efficiency) and to discover other effective leverage points (for example, socioeconomic regulation) for enacting a desired change (for example, reducing carbon emissions). This is becoming ever more critical for ML as a technology in order to prevent undesirable systemic effects such as the “lock-in” of environmentally damaging behaviors.25 Measures beyond efficiency are needed to prevent this, and the time to start working on them is now.
Furthermore, systems thinking aims to understand the interconnections in a system “in such a way as to achieve a desired purpose.”1 Thus, systems thinking has the potential to help move toward a “desired purpose” such as aligning ML as a technology with the SDGs.14,34 This enables us to consider not just the environmental sustainability of ML, but also ML for environmental sustainability,27 the relationship of ML as a technology with economic and social sustainability, and how these areas are connected.
Given the complexity and cross-disciplinary nature of reaching a systems-level understanding of ML as a technology and its impacts in practice, interdisciplinary collaboration is key. Initial work in this direction includes identifying the factors that affect ML sustainability holistically,14,16 developing governance frameworks,26 developing reporting frameworks,11,13 revealing the hidden costs of ML use,8,15 and more. A necessary next step will be to foster more dialogue around these impacts: which impacts to measure, how to measure them, and what influences them. This can help us to model “the rules of the game,” that is, how these impacts arise as a result of system-level behavior. Important questions then arise: What are effective interventions for changing the way ML as a technology operates for the better? Who can and should be involved in implementing these interventions? Which negative impacts do we want to limit, and which positive impacts do we want to encourage from ML? A systems-level understanding of ML as a technology offers a more informed way to explore these questions.
Conclusion
With respect to environmental sustainability, the ML community currently relies heavily on efficiency as the solution of choice.5,23,37 This is not without sensible motivation: efficiency can reduce carbon emissions; it is often easy to measure and implement; it lends itself to use as a metric by which one can compare different systems and methods; it can be deployed in many ways without requiring coordination between large groups of people; and it serves other goals, such as making ML systems faster and cheaper to operate. However, as ML systems become increasingly prevalent,25 it is incumbent on us to move beyond the dominant focus on efficiency and to cultivate a more nuanced view of the environmental impact of ML as a technology and of the ways to reduce it. In this article, we demonstrated why this is the case by describing three discrepancies between efficiency and the goal of environmentally sustainable ML, and we proposed systems thinking as a way to move beyond efficiency. The discrepancies are: compute, energy, and carbon are not equivalent; operational emissions across the ML model life cycle are affected by efficiency in unexpected ways; and efficiency alone is not enough to address the broader environmental impact of platforms. We thus illuminate opportunities for new research, policy, and practice that can holistically improve the environmental sustainability of ML as a technology.