Is AI model 'GenCast' the future of weather forecasting?
A new AI model named ‘GenCast’ has outperformed the best traditional medium-range weather forecast, and it is also better at predicting extreme weather. Our existing weather forecasting systems operate on supercomputers running massive simulations of the atmosphere, burning through megawatt hours of energy and undoubtedly contributing to climate change in the process. The new approach instead uses machine learning to spot patterns in historical weather data (basically, when conditions look like this, this is the outcome) and uses those patterns to predict future weather. And it does so with a fraction of the energy and in a fraction of the time. The findings have been published in the journal Nature, and Ilan Price and Remi Lam at Google DeepMind have been telling me all about their model…
Remi - The traditional way of making a weather forecast is to use a physical equation to describe how the weather in the atmosphere evolves over time. What this means is that you have to use a very large supercomputer. It's costly and time consuming. It's also error prone.
Chris - We know about the errors. Anyone who's been a victim of the weather forecast knows all about that!
Remi - What we've been trying to do at Google DeepMind is try to uncover the potential of using all of that historical weather data that we are sitting on to improve the weather forecast. By doing so, we believe we can make better weather forecasts and faster weather forecasts.
Chris - How do you do it, Ilan?
Ilan - We do it by training a machine learning model on four decades of historical weather data. The model learns weather patterns and weather dynamics directly by looking at that data, and that's what it uses to make predictions going forward.
Chris - When you say you train it, what is it actually looking at? What do you feed in? What's the input?
Ilan - What it's looking at are historical estimates of the state of the weather in the past. Basically, during training, the model is shown: okay, this is the weather state at time X, make a prediction for 12 hours' time. Then it gets shown what it should have predicted. By seeing many examples of this, it learns from its mistakes and learns which patterns it should pick up.
Chris - Did you focus on just one geography or were you feeding this global data?
Ilan - This is a global model. That's important because if you want to be able to predict the weather at medium range, so that's out to about 15 days in our model's case, you really need to be able to model the global atmospheric dynamics.
Chris - I was going to say, because obviously there's this old joke, isn't there: the butterfly flaps its wings and then there's a hurricane on the other side of the Atlantic. But it really is all interconnected, isn't it? You've got to be able to consider everything. But that has previously been such an intractable problem, because of its scale, that it hasn't been done.
Ilan - Absolutely. Chaos is the name of that phenomenon where very small things can have very large consequences. It's one of the reasons why the weather really is inherently uncertain. We actually know that we can't predict the weather exactly. One of the important features of our new model is that it's an ensemble forecast. We don't try and do the impossible and make exactly one prediction of what will happen. Instead, we make multiple predictions of what might happen and that gives us a sense of the range of different possible scenarios in the future. It lets us calculate, okay, how likely are some scenarios, how likely are other scenarios?
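The idea of an ensemble forecast can be illustrated with a toy chaotic system. In the sketch below, the logistic map stands in for the atmosphere, each ensemble member starts from a very slightly different state, and every member is stepped forward 30 times (15 days at one step per 12 hours). The fraction of members that cross a threshold is read as the probability of the "event". Everything here is an assumption made for illustration: the logistic map, the function names, the perturbation size and the threshold are hypothetical, and perturbing the starting state is not a description of how GenCast itself generates its ensemble.

```python
import random

STEPS_PER_FORECAST = 30  # 15 days at one step per 12 hours


def toy_step(x):
    """Logistic map: a standard toy chaotic system standing in for the atmosphere."""
    return 3.9 * x * (1.0 - x)


def rollout(x0, steps=STEPS_PER_FORECAST):
    """Step the toy system forward, feeding each output back in as the next input."""
    x = x0
    for _ in range(steps):
        x = toy_step(x)
    return x


def ensemble_forecast(x0, n_members=50, spread=1e-4, seed=0):
    """Run many forecasts from slightly perturbed versions of the same start state."""
    rng = random.Random(seed)
    members = []
    for _ in range(n_members):
        perturbed = min(max(x0 + rng.gauss(0.0, spread), 0.0), 1.0)  # keep in [0, 1]
        members.append(rollout(perturbed))
    return members


def event_probability(members, threshold):
    """Fraction of ensemble members whose final value exceeds the threshold."""
    return sum(m > threshold for m in members) / len(members)


members = ensemble_forecast(0.4)
print(event_probability(members, threshold=0.9))
```

Because tiny differences in the starting state grow with every step, the members end up spread across a range of outcomes, and the forecast becomes a statement about how likely each outcome is rather than a single answer.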
Chris - How much better is it, Ilan? If we compare what our weather forecasters thought was going to happen with what your model suggested was going to happen, how good is it?
Ilan - It's hard to put one single number on how much better GenCast is, because there are lots of different things that we would like from a weather model and the improvements are different on different tasks. But overall, on the headline metric, which is averaged over all times and all weather conditions across the year, GenCast is better for more than 97% of the evaluated targets. It's also better on a lot of the specific things we care about. For example, we might care specifically about extreme weather. We can ask, how good is GenCast at predicting a once-in-seven-years high temperature in a given location? We evaluated that in the paper and we see GenCast improving at that. Similarly, we care about predicting the trajectories of tropical cyclones. These have devastating consequences, and the more advance warning we have the better. We were able to show that GenCast is giving us better predictions of the tracks of these storms. It's giving us about a 12-hour advantage in accuracy over state-of-the-art operational models at the moment.
Chris - One of the things that Remi said earlier was that, in order to do what we do at the moment, it takes a supercomputer to do the sorts of calculations and run these models that enable us to make the predictions we have. How much better in energy terms is doing it your way than running those supercomputers?
Ilan - I don't have a good answer to that in energy units, but I can give you a comparison that makes it quite apparent. In comparison to hours on a supercomputer, with tens or hundreds of thousands of processors, we're talking about making a 15-day prediction by GenCast in eight minutes, produced by only a single TPU chip. That's a chip just a bit bigger than a computer. There are really orders of magnitude of difference in the amount of computation that it takes to generate a forecast with GenCast and machine learning models compared to these traditional physics-based models.
Chris - What are the implications of that then, Remi? You can argue we'll save a lot of energy because we won't have to run these supercomputers, and we'll get the results, which are potentially more accurate, more quickly. Apart from those, what are the implications of this?
Remi - I think this is quite a pivotal point in the way we do weather forecasting. It's much faster to make predictions, and it doesn't require supercomputers. What it means to me is that it'll be more accessible to do weather forecasting and to conduct research in weather forecasting. We're also making the model publicly available so people can do research on it. I think this is really going to accelerate progress within weather forecasting, both because it makes the research accessible, since it doesn't require a supercomputer, and because it provides a new way of improving the model: really pushing on the axis of the data rather than purely the compute axis.
Chris - Are there any risks though, Ilan, in the sense that with large language models, I know that's a different technology, one of the things to emerge that's caused some people some headaches has been things like confabulation where it just makes stuff up. Could this make up a hurricane that isn't going to happen and have everyone battening down the hatches unnecessarily? Or could it just miss a hurricane and say, well that's not a hurricane, that's going to be fine, and then we end up with people in danger because of it?
Ilan - It's a really great point and I think that there are a few important things to consider. The first is, of course, no model is perfect, no model is free from errors. As we've already discussed, that's also true of physics-based models. But it does raise the point that it's really important for us to be doing rigorous and scientific evaluation of these models, in peer-reviewed research like we've done in Nature. It's also important that, when these models are beginning to be incorporated into operational systems, as they are and as we think they will be going forward given these GenCast results, they be tested by weather forecasters and meteorologists so that trust is built in these systems. The second really important aspect is that the prospect of these kinds of mistakes even further highlights the importance of probabilistic models, right? We don't have to rely on a model either predicting that there's a cyclone or not. It's a question of, in how many of the scenarios that were predicted by the model did this occur? It allows us to estimate the risk of these events, and if a mistake is made in one of those predictions, then it'll only show up as a very low probability event and we don't have to worry as much.
Chris - Ilan, to finish, I've got a big birthday next year, I'm thinking of having a party. How's July looking?
Ilan - I'll have to come back to you. Go check the model.