This is the first part of a three-blog story. Part 2 is here. Part 3 is there.
I spent two years in Africa and was given the chance to learn how to play rugby in an “international team” – easy as it was the only one across the country :wink: .
Now, the 2015 Rugby World Cup is coming in just 46 days!
I wondered whether we could find out what makes a player become a rugby all-star during the World Cup.
From the site here, I extracted the statistics for all the players who ever participated in a World Cup.
That makes 2440 players: 50 all-star players, but also unsung heroes from Ivory Coast or Portugal.
The value of my target variable is 0 for a normal player, 1 for an all-star player.
My 50 all-star players originate from 9 countries, basically the Six Nations (Italy set apart) and the 4 nations.
(Apologies to my colleague pierpaolo.vezzosi for not shortlisting Italian players :???: )
This becomes a classification problem. I am going to see how the different input variables can help explain and predict the output variable. The full final dataset is attached to the post.
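Before modeling, it helps to see how unbalanced this target is. The short Python sketch below is not part of the SAP Predictive Analytics workflow; it only uses the two counts quoted above (2440 players, 50 all-stars) to compute the share of all-stars and the accuracy a trivial "always predict 0" model would reach. The function and variable names are my own, chosen for illustration.

```python
# Counts taken from the post; everything else is illustrative.
TOTAL_PLAYERS = 2440
ALL_STARS = 50


def base_rate(positives: int, total: int) -> float:
    """Share of the positive class (all-star = 1) in the dataset."""
    return positives / total


rate = base_rate(ALL_STARS, TOTAL_PLAYERS)

# Accuracy of a trivial model that labels every player as 0 (normal player).
majority_accuracy = 1 - rate

print(f"All-star share: {rate:.2%}")
print(f"Always-predict-0 accuracy: {majority_accuracy:.2%}")
```

With only about 2% of players flagged as all-stars, raw accuracy is a poor yardstick here: a model has to beat a baseline that is already right almost 98% of the time.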
I open SAP Predictive Analytics 2.2, click on Modeler and Create a Classification/Regression Model.
I load the Excel file containing all the players and their stats.
I need to describe the data. Here is the final screen, with all the information filled in (I attached the description file to the post).
I set the target variable and exclude the Player variable as it is not useful for the model.
I check the Compute Decision Tree check box.
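For readers curious about what a decision tree does at each node, here is a minimal Python sketch of one split chosen by Gini impurity. It is a generic textbook technique and I am not claiming it is the algorithm SAP Predictive Analytics uses internally. The toy values are made up and are not taken from the attached dataset.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    return 2 * p * (1 - p)


def best_split(values, labels):
    """Return (threshold, weighted_gini) for the best 'value <= threshold' split.

    If no split improves on the unsplit node, threshold is None.
    """
    n = len(values)
    best = (None, gini(labels))
    # The largest value is skipped: it would leave the right branch empty.
    for t in sorted(set(values))[:-1]:
        left = [y for v, y in zip(values, labels) if v <= t]
        right = [y for v, y in zip(values, labels) if v > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best[1]:
            best = (t, score)
    return best


# Made-up toy data: matches won per player and the 0/1 all-star flag.
toy_matches_won = [0, 1, 1, 2, 3, 9, 11, 12]
toy_all_star = [0, 0, 0, 0, 0, 1, 1, 1]

threshold, impurity = best_split(toy_matches_won, toy_all_star)
print(f"Best split: matches won <= {threshold} (weighted Gini {impurity:.3f})")
```

A full tree repeats this search inside each branch, over every candidate variable, until the branches are pure enough or too small to split further.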
The model overview gives me useful information:
There is one suspicious variable, Matches Won, as on its own it is a very strong predictor of the output. SAP Predictive Analytics warns us about the strong correlation between this input and the output variable.
The more you and your team win World Cup matches, the higher the chances are that you become a rugby legend. Makes sense, right?
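To get a feel for what "a very strong predictor on its own" means, one rough check is to score a single variable by its AUC: the probability that a randomly picked all-star has a higher value than a randomly picked normal player. The Python sketch below is my own illustration on made-up numbers; SAP Predictive Analytics relies on its own indicators for this warning, and I am not reproducing them here.

```python
def single_variable_auc(values, labels):
    """AUC of one numeric variable against a 0/1 target.

    Counts, over all (positive, negative) pairs, how often the positive
    has the higher value. Ties count for half.
    """
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Made-up toy data, not taken from the attached dataset.
toy_matches_won = [0, 1, 1, 2, 3, 9, 11, 12]
toy_all_star = [0, 0, 0, 0, 0, 1, 1, 1]

auc = single_variable_auc(toy_matches_won, toy_all_star)
print(f"AUC of Matches Won alone: {auc:.2f}")
```

An AUC near 0.5 means the variable carries no signal, while a value close to 1.0 means it nearly separates the two classes by itself. That is the kind of variable worth a second look, since it may simply restate the target.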
Let’s move to the variable contributions.
Some more facts:
I move to the Decision Tree and use the two most important variables in my model.
Here is my interpretation:
I will keep this first blog post short and leave the rest for the second part (teaser: it's all about predictions :wink: ).
Thanks for reading! Your comments are welcome!
Antoine