Leading into the Super Bowl last week, we posted a blog that analyzed both the Seattle Seahawks and the Denver Broncos and predicted a 26-23 Denver win.
It was an epic fail.
As the Public Relations lead for analytics, I decided it would be fun to reach out to a select few friendly media outlets before the game and go on record with this prediction. Leading into the event, the coverage was great.
And then... the game happened.
Post-game, like the sports commentators who called a similar score, we were left scratching our heads and asking what happened. Something has to be off when even the puppies in the Puppy Bowl got it wrong by calling for a Denver win. Here’s a snapshot of how the media called it:
Jeff Duncan, NOLA.com | The Times-Picayune -- Broncos, 24-23
Larry Holder, NOLA.com | The Times-Picayune -- Broncos, 26-23
Katherine Terrell, NOLA.com | The Times-Picayune -- Broncos, 31-24
Kent Babb, The Washington Post -- Broncos, 31-28
Will Brinson, CBSSports.com -- Broncos, 31-21
Bob Condotta, The Seattle Times -- Seahawks, 24-21
While we can leave the analysis of Peyton Manning and the Broncos' performance to the football experts, the more interesting question is: what are the limits of predictive technology?
Predictive analytics is a forward-looking version of ‘traditional’ Business Intelligence software. As Business Intelligence has progressed, it has gone from being about what happened, to what’s happening, to what’s going to happen. The “what’s going to happen” is where predictive analytics comes in. Analytics colleague Chandran Saravana blogged: “First of all, predictive analytics is less about predicting something; It is more about understanding the relationship between the variables and context of those variable in technical terms; In business terms how these variables translate to business events.”
Insurance companies mitigating risk, utilities forecasting power demand, recruiters using talent forecasting, and services that tie location data to predictive models (e.g., Foursquare, Facebook, Yelp) are all examples of predictive analysis in action.
As far as the Super Bowl goes, the argument can be made that the software, a combination of SAP Solutions for Analytics and SAP Lumira, did exactly what it was supposed to do: analyze data and visualize it. The visualizations were great, and it was hard to argue with the prediction. In a sport filled with prognosticators, taking the SAP call seemed a great choice.
The dirty secret? SAP does not have a time machine or a crystal ball. We can’t actually predict the future.
In our statement to the media, we very accurately summarized:
When analyzing and predicting the outcome of the Super Bowl, we took into account a number of variables. As we saw last night, it can be difficult to predict the human element of the game, that’s what makes the game exciting. We are predicting one isolated event with little back data on head to head match ups between the two teams – only two previous encounters with a similar team lineup – on a very big event day. The margin for error was very wide, and as such it’s very difficult to predict it with 100% certainty.
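To put a rough number on why so little head-to-head history leaves such a wide margin for error, here is a minimal sketch. It is not the model behind our prediction. It only assumes, hypothetically, that game-to-game point margins vary with a standard deviation of 13 points, and shows how the uncertainty in an average margin shrinks as more games become available. The function name and the 13-point figure are illustrative assumptions.

```python
import math


def standard_error(spread: float, n_games: int) -> float:
    """Standard error of an average point margin estimated from n_games results."""
    if n_games < 1:
        raise ValueError("need at least one game")
    return spread / math.sqrt(n_games)


# Hypothetical assumption: point margins vary by about 13 points from game to game.
ASSUMED_SPREAD = 13.0

for n in (2, 10, 50):
    error = standard_error(ASSUMED_SPREAD, n)
    print(f"{n:>2} games -> standard error of about {error:.1f} points")
```

Under that assumption, two games leave an uncertainty of roughly nine points in the average margin, far larger than the three-point gap in a 26-23 call, while fifty games would bring it under two.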
But let’s go a bit deeper: could we have done better? Are there limits to predictive analysis? Saravana continues on his blog:
When it comes to Sports, especially with respect to future events, there is huge gap in the historical data itself; As of today, many sports captures a hell of a lot of data, still there is a huge hole in the data collection; For example – we do not collect data about the mood of the individual player prior to the game, reaction of the player to particular event, mistakes, etc. When analyst or so called data scientists do analysis, it is all about the data in hand and analysis outcome purely based on the boundary of data he or she analyzed. When it comes sports prediction, you cannot just rely on data, it need to be combination of human intelligence + your data analysis; Even with that combination of human intelligence + data analysis, there is no guarantee the prediction will work, mainly because you do not get chance collect and analyze all possible data points (absolute); The 'absolute analysis' not possible ...