Inherit the Wind

Figuring out precisely how machine learning and artificial intelligence might improve upon classical numerical weather prediction models is still in its early stages. Credit: jkgeography.com

The phrase “free as the wind” may be accurate in the philosophical sense, but in the physical world, wind is not always free; it presents opportunities and costs. Accurately predicting the wind comes with real economic and humanitarian consequences, at both the macro- and micro-climate levels. Often, the causes and effects between large-scale and small-scale atmospheric phenomena are part of the same equation, literally and metaphorically.

While research into applying artificial intelligence (AI) and machine learning (ML) to weather forecasting is not new, AI has yet to make significant inroads into operational forecast models. Yet its relative computational economy, the ability to run much larger datasets much faster than established forecast models, has led both the meteorological and computer science communities to prepare for an imminent era of significant intellectual crossover.

Leading-edge ML research in weather is combining training datasets based on historical reanalysis data from established models with new observations; even data from projects that did not use ML might be of great benefit to future models, and perhaps vice versa.

In one set of those projects using traditional modeling, researchers from Colorado State University and the National Oceanic and Atmospheric Administration (NOAA) found that improvements in wind predictions for the High-Resolution Rapid Refresh (HRRR) model, one of the most widely used models in the U.S., would have resulted in a potential $384 million increase in real income nationwide, had the updated HRRR models been in place during model updating periods. Accurate wind forecasting is crucial to efficient scheduling of utility resources; forecasting more available wind energy than actually occurs can lead to last-minute switches to more-expensive fossil fuel-based power generation. Conversely, forecasting too little wind can cause overscheduling of other sources of electricity during a time they could be rested.

These projects were called the Wind Forecast Improvement Project. The first phase, WFIP-1, was conducted from 2011 to 2013, collecting data from the Great Plains and west Texas. The second phase, WFIP-2, ran from 2015 to 2019 in the Columbia River basin in the Pacific Northwest states of Oregon and Washington. However, reaping the savings that were discovered required painstaking collection and analysis of data on micro-climate factors. The team working on WFIP-2, for example, discovered that accurately forecasting wind speed at wind turbine heights was complicated by “cold pools” of stagnant air in the lowest part of the atmosphere. Those pools had not been accurately predicted before the project, which discovered that cloud cover strongly influences their formation.

Cold pools “are a problem for multiple reasons,” said Jim Wilczak, research meteorologist at NOAA’s physical sciences laboratory in Boulder, CO. Wilczak served as the NOAA technical manager for WFIP-1 and observational lead for WFIP-2. “For wind energy they’re a problem, because you get very little wind energy production. For air quality, if you’re in one of these valleys and you have people using fireplaces and cars putting out smoke and particulates, they get trapped and you get very polluted, dense, cold layers near the ground.”

The WFIP work may also represent a demarcation point: the close of an era in which established weather models, called numerical weather prediction (NWP) models, were predominantly analyzed and modified using classical computing techniques. The NWP models the WFIP researchers used took in observations of conditions that were then run through sophisticated physical equations, with no machine learning applied to elements such as prior wind measurements or cloud cover at those locations. While operational ML-based weather prediction is still in its infancy, Wilczak said he does not expect the status quo to last long.

“No machine learning was really applied to WFIP-1 or WFIP-2. We were strictly trying to improve the actual NWP models,” he said. “But I think that it won’t be very long until machine learning models surpass the forecast skill of current NWP models.”

Figuring out precisely how machine learning and AI might improve upon classical NWP models is still in its early stages, but the technologies’ promise is quite evident to Wilczak and others.

For example, Wilczak said that data in one topography, such as the cold pools found in the Columbia River Gorge, had limited transferability to possible study of wind in another, such as hilltop wind farms in upstate New York.

However, future researchers (or operational meteorologists) using ML in similar topographies anywhere in the world might be able to make use of such raw data to work in tandem with reanalysis data.

“In that sense, field programs like the WFIPs could potentially be important for providing the kinds of measurements that can go into the ML algorithms to try to improve the forecasts,” Wilczak said.

Adding such data need not be computationally expensive; in a summary essay covering efforts to incorporate AI into the National Weather Service in the U.S. in the August 2023 Bulletin of the American Meteorological Society, Paul J. Roebber and Stephan Smith pointed to a study that employed a deep-learning model trained with reanalysis data that generated 85,800 reforecasts in a few hours on a single graphics processing unit.

Satellites and LLM-Inspired Research

The late summer and early autumn in the Northern Hemisphere are peak times for the formation of tropical storms and hurricanes that can turn harmlessly out to sea or make a disastrous landfall, destroying property and causing widespread injury and death. Hurricane Ian, a Category 4 hurricane with sustained winds of 150 miles per hour when it made landfall in Florida in September 2022, killed 160 people in Cuba and the U.S. and caused between $50 billion and $65 billion in damages. Two of the most widely used forecast models, the GFS and the Euro, disagreed as to where the storm would strike land, which might have made the death toll larger than expected.

One of the largest hurricanes of 2023, Hurricane Lee, had the potential to be equally catastrophic and hard to predict. As it churned in the southern Atlantic Ocean, possibly threatening a large area of the northeast coast of the U.S. and Canada, pioneers in AI weather research told Communications their work might make forecasting such storms easier.

“Absolutely, yes, our work could help in gathering data on hurricanes like Lee,” Xubin Zeng, director of the climate dynamics and hydrometeorology center at the University of Arizona, said of a recent study he and his research team conducted. They used infrared satellite image data to predict wind speeds at multiple heights using water vapor movement. The study’s lead author, research scientist Amir Ouyed, developed ML algorithms to process the images. Comparisons with measurements from weather balloons showed the wind retrievals derived from their method were within the error range of existing satellite wind products and outperformed them in vertical resolution.

“Part of the reason work like ours could be valuable is that over the open ocean you don’t have any real observations of the wind,” Zeng said. “That’s why NOAA spends so much money every hurricane season to send out aircraft to make measurements.”

Christian Lessig, chief scientist of AtmoRep, an ambitious new multi-institutional AI-based weather forecasting effort in Europe, said such technology may indeed be a boon for severe weather predictions. “It may not be possible to exactly predict where the hurricane will go, but you can run better probabilistic forecasts. Because AI models are cheaper than conventional models, you can generate much larger ensembles and get a better probability of where the hurricane might go.”
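Lessig's point about ensemble size can be illustrated with a minimal sketch. It assumes each ensemble member reports a single landfall longitude and that members can be treated as independent samples; the function name, the longitude window, and the sample values are hypothetical and do not come from AtmoRep or any operational system.

```python
import math

def strike_probability(member_landfall_lons, lon_min, lon_max):
    """Estimate the chance of landfall inside [lon_min, lon_max].

    Returns the fraction of ensemble members landing in the window and
    the binomial standard error of that fraction (assumes independent members).
    """
    n = len(member_landfall_lons)
    hits = sum(1 for lon in member_landfall_lons if lon_min <= lon <= lon_max)
    p = hits / n
    std_err = math.sqrt(p * (1.0 - p) / n)
    return p, std_err

# Hypothetical landfall longitudes (degrees east) from a 4-member ensemble
small_ensemble = [-82.5, -82.0, -81.5, -83.5]
# The same spread of outcomes repeated in a 400-member ensemble
large_ensemble = small_ensemble * 100

p_small, err_small = strike_probability(small_ensemble, -83.0, -81.0)
p_large, err_large = strike_probability(large_ensemble, -83.0, -81.0)
```

Under these assumptions, both ensembles give the same estimated probability, but the sampling uncertainty shrinks with the square root of the number of members, which is why cheaper models that allow far more members can yield a sharper probabilistic forecast.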

Zeng’s and Lessig’s projects may represent the yin and yang of exploring AI in wind forecasting. Zeng’s team looked at relatively small images to derive computer vision-aided predictions of winds at variable heights within a given vertical stack, something that had ranged from impractical to impossible with the weather instruments available for operational forecasts. Thanks to the serendipitous extended lifetime of one NOAA satellite, which followed the same orbit as a newer one but 50 minutes apart, Zeng and his team were able to use differentiated pixelated data to discern water vapor movements, which are affected by how winds move the vapor, and are not visible to the human eye.
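The underlying idea, inferring wind from how far a water vapor feature moves between two images taken a known time apart, can be sketched in a few lines. This is a simple brute-force feature-matching illustration, not the ML algorithms Ouyed developed; the one-dimensional brightness rows and the 10-km pixel size are hypothetical, and only the 50-minute separation comes from the article.

```python
def best_shift(earlier, later, max_shift):
    """Return the integer pixel shift that best aligns `later` with `earlier`."""
    n = len(earlier)
    best, best_err = 0, float("inf")
    for shift in range(-max_shift, max_shift + 1):
        # Compare only the pixels that still overlap after shifting
        pairs = [(earlier[i], later[i + shift])
                 for i in range(n) if 0 <= i + shift < n]
        err = sum((a - b) ** 2 for a, b in pairs) / len(pairs)
        if err < best_err:
            best, best_err = shift, err
    return best

def wind_speed_m_per_s(shift_pixels, pixel_size_km, minutes_apart):
    """Convert a pixel displacement into a speed in meters per second."""
    return shift_pixels * pixel_size_km * 1000.0 / (minutes_apart * 60.0)

# Hypothetical water vapor brightness along one image row, 50 minutes apart
earlier_row = [0, 0, 1, 5, 1, 0, 0, 0, 0, 0]
later_row = [0, 0, 0, 0, 1, 5, 1, 0, 0, 0]

shift = best_shift(earlier_row, later_row, max_shift=3)
speed = wind_speed_m_per_s(shift, pixel_size_km=10.0, minutes_apart=50.0)
```

Here the feature moves two pixels, or 20 km under the assumed pixel size, in 50 minutes. A real retrieval would work on two-dimensional images at several water vapor levels and would need quality control that this sketch omits.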

AtmoRep, on the other hand, was inspired by the principles of large language models (LLMs). Its creators have the goal of training one generative neural network that represents and describes all atmospheric dynamics, including wind; its modular nature would allow researchers and forecasters to pull out various elements, including wind, precipitation, temperatures, and so on. The challenge of building such a model is daunting, however. It needs to deal with much more complex processes and datasets than natural language processing models; the model has 3.5 billion parameters.

AtmoRep is trained on the European Centre for Medium-Range Weather Forecasts (ECMWF) ERA5 reanalysis dataset, which includes hourly estimates of atmospheric parameters such as air temperature, air pressure, and wind speeds at different altitudes. It also includes surface parameters such as rainfall, soil moisture content, ocean-wave height, and sea-surface temperature, spanning the period from January 1940 to the present.

To mitigate biases inherent in the training data, AtmoRep also can use contemporary observational data, as Wilczak surmised; in their pre-print introduction to the project, last revised early in September, the AtmoRep team introduced RADKLIM precipitation radar data from Germany’s national weather service. The result showed the extent and shape of the precipitation fields forecast by AtmoRep are much closer to RADKLIM than the original ERA5 data.

In theory, Lessig said, research such as Zeng’s could be very useful to add to neural networks like AtmoRep.

“One of the strengths of machine learning is that it can combine different data sources into a coherent neural network model, like LLMs that are trained on text from very different sources or text-image models that combine text and images into an abstract internal representation,” he said. “One can thus envision that training on reanalysis and data like the ones obtained by Ouyed and his colleagues can be combined to enrich each other.”

Lessig said the relative coarseness of the Zeng team’s images, 1 degree, or roughly 100 kilometers, limited their usefulness in downscaling (a technique used to translate large-scale model data to smaller spatial scales, such as a single watershed). However, he noted, other satellite measurements have resolutions of 1 to 5 km, and land-observing systems are working at resolutions of about 10 m. Lessig said he hopes the team’s intention to make the AtmoRep code open source might encourage others to contribute and obtain data and code in a true community model.
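The degree-to-kilometer figure is a rule of thumb that can be checked with a short calculation. The sketch below assumes a spherical Earth with a mean radius of 6,371 km; the function name is hypothetical.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius, spherical approximation

def km_per_degree(latitude_deg):
    """Return (north-south, east-west) kilometers spanned by one degree."""
    # One degree of latitude covers the same distance everywhere on a sphere
    north_south = 2.0 * math.pi * EARTH_RADIUS_KM / 360.0
    # One degree of longitude shrinks with the cosine of the latitude
    east_west = north_south * math.cos(math.radians(latitude_deg))
    return north_south, east_west

ns_equator, ew_equator = km_per_degree(0.0)
ns_60, ew_60 = km_per_degree(60.0)
```

On this approximation one degree spans about 111 km at the equator, consistent with the round figure of 100 km, while a degree of longitude at 60 degrees latitude covers only half that distance.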

Zeng’s study and the WFIP projects may be exactly the sort of foundational blocks such a community can build itself around. Zeng said the results of his team’s work were a demonstration of the feasibility of a future satellite mission in which they hope to provide 10-km resolution, adding that the increased cross-pollination of meteorology and machine learning research is at an exciting juncture.

“My group is not the developer of the actual technology, we used it from the field of computer science,” Zeng said. “That’s how science works. We learn from each other.”

Gregory Goth is an Oakville, CT-based writer who specializes in science and technology.