Betting on Snow Predictions

Pretend someone is willing to bet you $50 that El Niño will not occur. Before you jump at it, you might want to know what the chances of El Niño are, right? So you look up your favorite model prediction and discover there is a 90% chance of El Niño. The odds are in your favor, so you take the bet.

But something happens and you lose. El Niño doesn’t occur. Oh the horror! Does that mean the model is totally useless? After all, it forecasted a 90% chance of El Niño and it didn’t happen. You might think the model was awful and decide not to trust your money to such a prediction next time.

These sorts of bets occur all the time. And probabilistic forecasts (models that tell you there is a certain percentage chance of an outcome) are becoming more popular. Examples include FiveThirtyEight’s election and sports predictions.

But, despite their popularity, it is easy to misunderstand what these models are telling you. Normally, we like to think that a prediction is either “right” or “wrong.” However, there is value in such probabilistic models even when they appear to be flat-out “wrong.”

Let’s look more closely at your bet. There was a 90% chance of El Niño. That means there was a 10% chance of no El Niño. Phrased another way, it means that given similar starting conditions, 1 in 10 times El Niño won’t develop at all. Unluckily for you, the 1 non-El Niño time appeared on the very first try!

Maybe if you had hung in there and kept making bets, the next 9 times would all have been El Niños (1). You would have won $50 x 9 = $450! A nice chunk of change: that is why it makes sense to play a long game. But if you only played the game once and became discouraged by that first attempt, you would never get to take advantage of the model, because you might have thought it was flat-out “wrong.”
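To see why the long game matters, here is a minimal simulation sketch (not from the original post; it simply reuses the 90% probability and $50 stake from the example above):

```python
import random

random.seed(42)  # fixed seed so the example is repeatable

P_EL_NINO = 0.9  # the model's forecast probability of El Niño
STAKE = 50       # dollars won (or lost) on each bet

def one_bet():
    """Payoff of a single $50 bet that El Niño WILL occur."""
    el_nino_happens = random.random() < P_EL_NINO
    return STAKE if el_nino_happens else -STAKE

# Any single bet can lose, just like in the story above...
print("one bet:", one_bet())

# ...but over many bets the 90% edge dominates.
n_bets = 10_000
total = sum(one_bet() for _ in range(n_bets))
print(f"average payoff per bet over {n_bets} bets: ${total / n_bets:.2f}")
# Expected value: 0.9 * $50 + 0.1 * (-$50) = $40 per bet
```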

This can work the other way too. Say, in that very first turn, you won! There was a 9 in 10 chance of El Niño and then El Niño occurred. You assume this is a great model that is “right!” However, even that confidence is premature; in this one try, you just happened to be in that 9 in 10 moment. You very well could have been in the other category though it was less likely.

In fact, if the model is reliable, about 1 in every 10 such forecasts should result in no El Niño (we just don’t know which one it will be). It might seem strange, but a model is actually slightly unreliable if it forecasts a 90% chance of a certain outcome and that outcome is observed 100% of the time (or any % other than 90%).

In a reliable model, the forecast probability should equal the historical probability over a long observational record.

More simply: if you are flipping a coin, you would forecast that over a lot of coin flips, you would get 50% heads and 50% tails. This is your prediction model: you expect the coin is equally weighted on each side, so there is a 1 in 2 chance of heads or tails. If the very first flip comes up heads, you don’t then conclude your model is right or wrong. You recognize that you haven’t flipped the coin enough times to see whether your model is reliable. Over many flips (a long history), you should observe an outcome that is close to 50% heads and 50% tails.
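Here is a quick sketch of that idea in code (illustrative only; it just simulates a fair coin and counts heads):

```python
import random

random.seed(0)

flips = [random.random() < 0.5 for _ in range(10_000)]  # True = heads

# One flip tells you almost nothing about whether the 50% forecast is reliable...
print("fraction heads after 1 flip:      ", sum(flips[:1]) / 1)
print("fraction heads after 100 flips:   ", sum(flips[:100]) / 100)
# ...but over a long record the observed frequency should settle near 50%.
print("fraction heads after 10,000 flips:", sum(flips) / len(flips))
```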

Sometimes we show how good a probabilistic model is with a reliability diagram. We look at a longer period of time (30 years of past forecasts, or hindcasts) and ask how good the model was at predicting the chance of El Niño. Now keep in mind, as Tony has described previously, there are three possible outcomes in ENSO outlooks: a % chance of El Niño, La Niña, and Neutral. Here is a reliability diagram for 6-month predictions from the CFSv2 model, showing the reliability of the probabilities for El Niño (red line, labeled “above”) and for La Niña (blue line, labeled “below”):

The straight, black diagonal line shows the result for an ideal model with perfect reliability (the forecast chance equals the historical chance in observations). What this figure shows us is that CFSv2 provides “overconfident” predictions 6 months out (2). That is, when it forecasts a 90% chance of El Niño, El Niño is actually observed only about 60% of the time (red line). The model generally becomes more reliable (closer to the black diagonal line) at shorter leads, when the prediction is made closer to the time being predicted (e.g., lead-0).
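For readers curious how the curves in a diagram like this are built, here is a rough sketch of the underlying calculation (a generic illustration, not the actual code behind the CFSv2 figure; `forecast_probs` and `outcomes` are hypothetical arrays of past forecast probabilities and 1/0 observed outcomes):

```python
import numpy as np

def reliability_curve(forecast_probs, outcomes, n_bins=10):
    """Bin forecasts by their stated probability and return, for each bin,
    the average forecast probability and the observed frequency of the event."""
    forecast_probs = np.asarray(forecast_probs, dtype=float)
    outcomes = np.asarray(outcomes, dtype=float)  # 1 = event occurred, 0 = it did not
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    idx = np.clip(np.digitize(forecast_probs, bins) - 1, 0, n_bins - 1)

    mean_forecast, observed_freq = [], []
    for b in range(n_bins):
        in_bin = idx == b
        if in_bin.any():
            mean_forecast.append(forecast_probs[in_bin].mean())
            observed_freq.append(outcomes[in_bin].mean())
    return np.array(mean_forecast), np.array(observed_freq)

# A perfectly reliable model puts every point on the diagonal
# (forecast probability == observed frequency). An overconfident model
# sags below the diagonal at high probabilities, e.g., 90% forecasts
# that verify only ~60% of the time.
```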

Probabilistic forecasts might seem wishy-washy at first glance. Aren’t they just a way to cover one’s backside if a prediction is “wrong”? But a single forecast is not enough to tell you whether a prediction is “good” or “bad.” It’s over the long haul that making bets on probabilities can pay off.

Author: Michelle L’Heureux
