How the forecast is generated
GETTING THE DATA
The radar data comes from over 140 radar stations spread across the globe. The raw radar data is downloaded in binary format. To give you a sense of the quality of the data, here is a color-coded map:
- Areas in black are covered by our data sources, but suffer from a lack of quality control: we cannot reliably check the results of our predictions in these areas, and so cannot verify their accuracy. (Expect them to be more accurate the closer they are to an area of a different color.)
- Areas in blue are covered by our data sources and are also quality controlled by checking them retroactively against weather observations. However, the data coverage is low-resolution, and so may not accurately convey local weather conditions.
- Areas in red are covered by high-resolution, quality-controlled data sources, which should provide accurate hourly forecasts down to the local level.
- Areas in green are covered by the hyperlocal prediction system, and can provide down-to-the-minute forecasts for an exact location. Notably, we can provide hyperlocal forecasts for the United States, United Kingdom, Ireland, and small parts of Canada. Hyperlocal coverage of Canada, Australia, and more of Europe is in the works.
With so many data sources, we can check them against each other; where they agree, we can be more confident in our predictions.
CLEANING IT UP
Weather radar is noisy. There's lots of ground clutter, bug and bird migrations (!), and other artifacts that can be mistaken for precipitation. Here's an example:
The noise needs to be cleaned up. It mostly consists of low-intensity data -- the light blue areas. One technique for cleaning the noise is to simply remove all low-intensity data. That would clear away the noise and leave most of the actual precipitation data.
However, for our purposes, "most" isn't good enough. Removing low-intensity data indiscriminately would also remove valuable data from the leading and trailing edges of the storms. This data is crucial for predicting when it will start and stop raining. So the technique needs to be more sophisticated.
The noise is removed with neural nets, using the Fast Artificial Neural Network (FANN) C library. The result is a small, blazing fast program that accurately identifies somewhere between 90% and 95% of the noise, with very few false positives. The end result looks something like this:
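Separately from the image referenced above, the trade-off between blanket thresholding and a more selective filter can be sketched in a few lines of Python. This is an illustration only: the neighbor-based rule below is a simple stand-in heuristic, not the neural net used in the actual system, and the names `naive_threshold`, `neighbor_aware_filter`, `FRAME`, and `CUTOFF` are hypothetical.

```python
# Illustrative sketch only. The real system uses a neural net (FANN);
# the neighbor rule here is an assumed, much simpler stand-in.

def naive_threshold(frame, cutoff):
    """Zero out every pixel whose intensity is below `cutoff`."""
    return [[v if v >= cutoff else 0 for v in row] for row in frame]


def neighbor_aware_filter(frame, cutoff, reach=1):
    """Keep a weak pixel only if a strong pixel lies within `reach` cells.

    Strong pixels (>= cutoff) are always kept. Weak pixels that sit next to
    a strong one are treated as storm edges; isolated weak pixels are
    treated as noise and removed.
    """
    rows, cols = len(frame), len(frame[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            value = frame[r][c]
            if value >= cutoff:
                out[r][c] = value
            elif value > 0:
                near_strong = any(
                    frame[rr][cc] >= cutoff
                    for rr in range(max(0, r - reach), min(rows, r + reach + 1))
                    for cc in range(max(0, c - reach), min(cols, c + reach + 1))
                )
                if near_strong:
                    out[r][c] = value
    return out


# Made-up intensities: a small storm on the right, two specks of noise on the left.
FRAME = [
    [1, 0, 0, 0, 0, 0],
    [0, 0, 0, 2, 5, 2],
    [0, 1, 0, 2, 6, 2],
    [0, 0, 0, 0, 2, 0],
]
CUTOFF = 3

thresholded = naive_threshold(FRAME, CUTOFF)       # storm edges are lost
filtered = neighbor_aware_filter(FRAME, CUTOFF)    # storm edges survive
```

On this toy frame, the blanket threshold keeps only the two strongest pixels, while the neighbor-aware rule removes the isolated specks and keeps the weak pixels around the storm core. A trained classifier can draw on much richer context than this single rule.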
EXTRACTING STORM VELOCITY
Now that we have the radar data and we've cleaned it up, we come to velocity extraction. We use various computer vision algorithms to compare multiple radar image frames and create a map of velocity. Specifically, we use OpenCV, an open-source computer vision library that comes with a number of optical flow and object tracking algorithms. (For a great introduction to the topic and some sample code, check out this tutorial by David Stavens at the Stanford AI Labs.) The end result looks something like this:
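Apart from the image referenced above, the basic idea of comparing frames to recover motion can be sketched without any library. The example below uses exhaustive block matching, a much simpler technique than the optical flow algorithms in OpenCV, and it recovers one displacement for the whole frame rather than a per-pixel velocity map. The function name `estimate_shift` and the sample frames are hypothetical.

```python
# Illustrative sketch only: exhaustive block matching in pure Python,
# standing in for OpenCV's optical flow routines.

def estimate_shift(prev, curr, max_shift=2):
    """Return the (dy, dx) shift that best maps `prev` onto `curr`.

    Every shift within +/- max_shift cells is tried, and the one with the
    lowest mean absolute difference over the overlapping region wins.
    """
    rows, cols = len(prev), len(prev[0])
    best = None  # (score, dy, dx)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            total, count = 0, 0
            for r in range(rows):
                for c in range(cols):
                    rr, cc = r + dy, c + dx
                    if 0 <= rr < rows and 0 <= cc < cols:
                        total += abs(curr[rr][cc] - prev[r][c])
                        count += 1
            if count == 0:
                continue  # no overlap at this shift
            score = total / count
            if best is None or score < best[0]:
                best = (score, dy, dx)
    return best[1], best[2]


# Made-up frames: the same small storm, moved one row down and two columns right.
PREV_FRAME = [
    [0, 0, 0, 0, 0, 0],
    [0, 3, 5, 0, 0, 0],
    [0, 4, 6, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]
CURR_FRAME = [
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 3, 5, 0],
    [0, 0, 0, 4, 6, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
]

velocity = estimate_shift(PREV_FRAME, CURR_FRAME)  # cells per frame, as (dy, dx)
```

Running the same search on small tiles of the image instead of the whole frame would give a coarse velocity map, which is closer in spirit to what dense optical flow produces.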
PREDICTION AND INTERPOLATION
Now that we've extracted storm velocity, it's time to use it to predict the future. This part of the system is proprietary and not available for public disclosure. It's a numerical and statistical calculation, rather than a meteorological one.
MONITORING ERROR
A prediction is only useful if it is reliably accurate. A large amount of our effort is focused on measuring the error rate of predictions. Some storms are more coherent and stable than others, so how far into the future we can project varies over time and from one location to another. Whenever we process a new radar image, we go back to previous images and project them forward, creating a map of what we think the storm will look like in the present. We then compare this with the latest radar image to see how close we got. We are constantly doing this check in real time, for every radar station. This lets us monitor our accuracy, and helps us quantify how effective future improvements are.
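The check described above can be sketched as follows. Because the actual prediction step is proprietary, this example assumes the simplest possible projection (shifting a past frame by a constant velocity) and an assumed error metric (mean absolute error). The names `advect`, `mean_abs_error`, and `hindcast_error` are hypothetical.

```python
# Illustrative sketch only. Constant-velocity shifting and mean absolute
# error are assumptions; the real projection method is not public.

def advect(frame, dy, dx):
    """Shift every pixel by (dy, dx); pixels leaving the frame are dropped."""
    rows, cols = len(frame), len(frame[0])
    out = [[0] * cols for _ in range(rows)]
    for r in range(rows):
        for c in range(cols):
            rr, cc = r + dy, c + dx
            if 0 <= rr < rows and 0 <= cc < cols:
                out[rr][cc] = frame[r][c]
    return out


def mean_abs_error(predicted, observed):
    """Average absolute intensity difference across all pixels."""
    diffs = [
        abs(p - o)
        for p_row, o_row in zip(predicted, observed)
        for p, o in zip(p_row, o_row)
    ]
    return sum(diffs) / len(diffs)


def hindcast_error(past_frame, velocity, steps, latest_frame):
    """Project `past_frame` forward by `steps` frames and score it
    against the radar image that actually arrived."""
    dy, dx = velocity
    predicted = advect(past_frame, dy * steps, dx * steps)
    return mean_abs_error(predicted, latest_frame)


# Made-up frames: a storm that moved one cell down and right per frame, for two frames.
PAST = [
    [4, 2, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
LATEST = [
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 4, 2],
    [0, 0, 0, 0],
]

projected_error = hindcast_error(PAST, (1, 1), 2, LATEST)    # uses the velocity
persistence_error = hindcast_error(PAST, (0, 0), 2, LATEST)  # assumes no motion
```

Comparing the projected error against a no-motion baseline, as in the last two lines, is one way to tell whether a change to the system actually improves on doing nothing.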
IN CONCLUSION...
The forecast consists of a number of different moving parts that all need to fit together to create accurate predictions. This documentation glosses over many of the details, but hopefully it helps get across the essential process and the approach to weather forecasting.