The Energy Footprint of Humans and Large Language Models

Each time someone interacts with a large language model (LLM), running the model for inference has an energy cost. There is also an energy cost in preparing and training the model before it is brought to production. It is relatively easy to express these costs in kilowatt-hours (kWh), but hard to compare them with the alternative use of human time. Writing a text like this one involves a human and a computer running a word processor. If an LLM is brought into the process to make it faster, are we using more energy or, on the contrary, saving resources? Below, we present some back-of-the-envelope calculations to shed light on this issue. We start with the inference phase and then address the training phase.

Text Generation

Humans can only write a certain number of words per workday. While Ernest Hemingway wrote about 500 words per day, Stephen King wrote 2,000 words, and Michael Crichton a record 10,000 words [1]. Let us settle for 2,000 words, around 250 words per hour in an 8-hour workday.

Estimating the energy cost of human labor is less straightforward. We could look at the energy requirements of our bodies. Writing is a sedentary activity, and even walking burns more calories than thinking and typing; we can generously assume a consumption of 100 kilocalories (food calories) per hour of writing [2], about 0.12 kWh.
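The unit conversion behind this figure can be checked with a few lines of Python. This is only a sketch: the constant and function names are illustrative, and it assumes the 100 calories are food calories (kilocalories).

```python
# Convert a metabolic cost in kilocalories to kWh.
JOULES_PER_KCAL = 4184.0   # one thermochemical kilocalorie in joules
JOULES_PER_KWH = 3.6e6     # 1 kWh = 3.6 million joules

def kcal_to_kwh(kcal: float) -> float:
    """Return the energy in kWh equivalent to `kcal` kilocalories."""
    return kcal * JOULES_PER_KCAL / JOULES_PER_KWH

# Assumed metabolic cost of one hour of writing, taken from the text.
writing_kwh_per_hour = kcal_to_kwh(100)
print(f"{writing_kwh_per_hour:.2f} kWh per hour of writing")
```

The result rounds to the 0.12 kWh used in the rest of the comparison.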

If we instead consider data on energy consumption per capita, the values are much larger. In the U.S., total energy use is close to 80,000 kWh per person per year (in Europe it ranges from 25,000 to close to 100,000 in the north). Restricting the analysis to electricity, the U.S. figure is around 12,500 kWh per person per year [3]. Assuming an 8-hour workday and 260 workdays per year gives 2,080 work hours, and spreading the annual electricity use over those hours brings the cost of one hour of a person's work to around 6 kWh[a].
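The per-hour figure, and the round-the-clock variant in footnote [a], follow from a simple division. A minimal sketch, using only the numbers quoted in the text (the variable names are illustrative):

```python
# U.S. per-capita electricity use, as quoted in the text [3].
ELECTRICITY_KWH_PER_YEAR = 12_500
WORKDAYS_PER_YEAR = 260
HOURS_PER_WORKDAY = 8

# Spread the annual electricity use over working hours only.
work_hours_per_year = WORKDAYS_PER_YEAR * HOURS_PER_WORKDAY
kwh_per_work_hour = ELECTRICITY_KWH_PER_YEAR / work_hours_per_year

# Footnote [a]: spread it over every hour of the year instead.
kwh_per_clock_hour = ELECTRICITY_KWH_PER_YEAR / (365 * 24)

print(f"{work_hours_per_year} work hours per year")
print(f"{kwh_per_work_hour:.1f} kWh per work hour")
print(f"{kwh_per_clock_hour:.1f} kWh per clock hour")
```

Attributing all of a person's electricity use to their working hours is a deliberately crude choice; the footnote variant gives a lower bound of the same kind.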

Now for the energy cost of running an LLM. We have set a target of 250 words in an hour. LLMs generate tokens, which are parts of words, so if we use the standard ratio (for English) of 0.75 words per token, our target for one hour of work is around 333 tokens. Measurements with Llama 65B reported around 4 Joules per output token [4]. This leads to 1,332 Joules for 333 tokens, about 0.00037 kWh. Even lower energy costs can be obtained by running smaller models locally (I was able to use a local Meta Llama 3 8B model to generate a 250-word essay in 20 seconds on an Apple M3, using less than 200 Joules).
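The inference estimate can be reproduced as follows. This is a back-of-the-envelope sketch using the figures quoted above; the variable names are illustrative, and the 200 Joules for the local run is treated as an upper bound.

```python
JOULES_PER_KWH = 3.6e6      # 1 kWh = 3.6 million joules

WORDS_PER_HOUR = 250        # human writing target from the text
WORDS_PER_TOKEN = 0.75      # rule-of-thumb ratio for English
JOULES_PER_TOKEN = 4.0      # reported for Llama 65B [4]

# Tokens needed to produce one hour's worth of human writing.
tokens = round(WORDS_PER_HOUR / WORDS_PER_TOKEN)

# Energy for those tokens, in joules and in kWh.
energy_joules = tokens * JOULES_PER_TOKEN
energy_kwh = energy_joules / JOULES_PER_KWH

# Upper bound for the local Llama 3 8B run mentioned in the text.
local_kwh = 200 / JOULES_PER_KWH

print(f"{tokens} tokens, {energy_joules:.0f} J, {energy_kwh:.5f} kWh")
print(f"local run: under {local_kwh:.6f} kWh")
```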

Although these values are only approximations, the margin is very large. Compared with the 0.00037 kWh needed to generate 250 words, our body will use more than 300 times that amount over the hour it takes to write them. That hour of work will probably also be supported by several kWh of electric energy, more than three orders of magnitude above the LLM figure. We can reasonably conclude that it can be very energy-efficient to offload some parts of writing tasks to LLMs, combine them with human steering and validation, and hopefully work fewer hours.
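The ratios behind this comparison can be sketched with the rounded figures from the text: 0.00037 kWh for the LLM, 0.12 kWh for the body, and either 6 kWh per work hour or the 1.4 kWh per clock hour of footnote [a] for electricity. The variable names are illustrative.

```python
import math

LLM_KWH = 0.00037            # 333 tokens at about 4 J per token
BODY_KWH = 0.12              # one hour of writing, metabolic cost
ELECTRICITY_WORK_KWH = 6.0   # electricity per work hour (U.S.)
ELECTRICITY_CLOCK_KWH = 1.4  # electricity per clock hour, footnote [a]

# How many times more energy each human-side figure is than the LLM figure.
body_ratio = BODY_KWH / LLM_KWH
work_ratio = ELECTRICITY_WORK_KWH / LLM_KWH
clock_ratio = ELECTRICITY_CLOCK_KWH / LLM_KWH

print(f"body vs LLM: {body_ratio:.0f}x")
print(f"electricity (work hours) vs LLM: {work_ratio:.0f}x, "
      f"10^{math.log10(work_ratio):.1f}")
print(f"electricity (all hours) vs LLM: {clock_ratio:.0f}x, "
      f"10^{math.log10(clock_ratio):.1f}")
```

Whichever electricity figure is chosen, the gap stays above three orders of magnitude, so the conclusion does not hinge on how a person's electricity use is attributed to working time.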

Energy aside, using LLMs during writing has ethical implications for authorship. Still, some tasks, like rephrasing, asking for guidelines, or text compression, appear to fall within acceptable ethical use and can boost productivity. One should avoid the temptation to simply use them to produce more low-quality content. Here I quote François Chollet, creator of the Keras open source library and an AI researcher: “What’s holding back research isn’t a lack of verbose, low-signal, high-noise papers. Using LLMs to automatically generate 100x more of those will not accelerate science, it will slow it down.”

Model Training

The cost of training a big foundation model can be daunting. The electricity required to train GPT-3 was estimated at around 1,287,000 kWh [5]. Estimates for Llama 3 are a little above 500,000 kWh[b], a value that is in the ballpark of the energy use of a seven-hour flight of a big airliner. However, unlike a single long flight, which cannot be reused, a trained foundation model yields a set of weights that can be shared and reused in many different instances.

Humans also accrue training costs. Grossly simplifying, and looking only at electricity use, we can approximate that value by considering a 20-year-old writer who, if raised in the U.S., has probably used close to 250,000 kWh of electricity (20 years at around 12,500 kWh per year). In other countries, these costs would be considerably lower. A fully trained foundation model cannot replace a human (life, spirituality, and humanity are much more than writing 250 words per hour). Still, for simple text generation tasks, the LLM training energy cost can be compared to raising two humans in the U.S. and maybe a half-dozen in less-energy-intensive countries (like my own).
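The comparison between model training and "human training" is again simple arithmetic. A rough sketch using the figures quoted in the text; the variable names are illustrative, and the Llama 3 value is rounded down to 500,000 kWh.

```python
# Electricity figures quoted in the text.
US_ELECTRICITY_KWH_PER_YEAR = 12_500   # per person [3]
WRITER_AGE_YEARS = 20

GPT3_TRAINING_KWH = 1_287_000          # estimate from [5]
LLAMA3_TRAINING_KWH = 500_000          # "a little above" this value

# Electricity used while raising one 20-year-old writer in the U.S.
human_training_kwh = US_ELECTRICITY_KWH_PER_YEAR * WRITER_AGE_YEARS

# Training cost of each model, expressed in "U.S. humans raised".
llama3_in_humans = LLAMA3_TRAINING_KWH / human_training_kwh
gpt3_in_humans = GPT3_TRAINING_KWH / human_training_kwh

print(f"one U.S. writer: {human_training_kwh} kWh")
print(f"Llama 3: about {llama3_in_humans:.1f} U.S. humans")
print(f"GPT-3: about {gpt3_in_humans:.1f} U.S. humans")
```

The "two humans" in the text corresponds to the Llama 3 estimate; with the GPT-3 estimate the same calculation gives roughly five.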

Unlike humans and flights, foundation LLMs are perfectly cloneable and can serve as a base for fine-tuning towards more specific tasks. This allows the training costs they incur to be amortised quickly.

Closing Thoughts

It is a bit off-putting to make these direct comparisons between humans and the blind statistical machines that LLMs are. However, reports on the energy expenditure of these models often do not consider how hugely inefficient humans can be when using energy in a developed society. Maybe we can save a bit of that energy by cleverly using all the tools at our disposal, including LLMs.

Acknowledgements

I want to thank Luís Cruz, from TU Delft, for his comments on improving this text, and thank Alex de Vries for an author copy of his paper.

Carlos Baquero is a professor in the Department of Informatics Engineering within the Faculty of Engineering at Portugal’s Porto University and is also affiliated with INESC TEC. His research is focused on distributed systems and algorithms.

References

[1] The Novelry. Average Daily Word Count for Writers. (accessed May 2024)

[2] Robert H. Shmerling. The truth behind standing desks. Sep. 2016. Harvard Health Blog.

[3] Hannah Ritchie, Pablo Rosado, and Max Roser. Energy Production and Consumption. Our World in Data. Jan. 2024.

[4] Siddharth Samsi et al. From Words to Watts: Benchmarking the Energy Costs of Large Language Model Inference. Oct. 2023. arXiv.

[5] Alex de Vries. The growing energy footprint of artificial intelligence. Oct. 2023. Cell Press, Joule, Volume 7, Issue 10.

[a] Even if humans worked 24 hours every day of the year, this figure would still be 1.4 kWh in the US.

[b] These figures do not reflect the energy costs of prior experimentation before final training, or the energy cost embodied in the hardware used, but they should still indicate the order of magnitude.