About Author:
Name: Sudeepti Bandi
Role: Principal Consultant from Analytics Practice
Introduction:
‘Data science’, ‘data analyst’, ‘chief data officer’... these are the biggest buzzwords right now for people in IT, data warehousing and reporting! Apart from creating promising career options, they have also generated interest because of the way the analytics space brings people close to business strategy. These jobs challenge one's brain, create passion and give a lot of satisfaction.
Having said that, I wonder how wise it is for one person to do everything under the umbrella of data science. While cracking algorithms and using them to build models, a thought came to my mind: why not break the ‘data scientist’ role into smaller, simpler and more sensible roles?
So, what are the different steps involved in an end-to-end analytics project?
Data science - Life cycle of a project
There are at least 8 steps in a data science project, and some of them may repeat as needed.
• Step 1 - Identify the business problem/value addition/question – This has to be the starting point.
• Step 2 - Data availability – Have the structure of your data set defined. The real challenge starts here:
o Do we have the data?
o Do we have access to the required data?
• Step 3 - Getting data – How to collect the data from the different sources in the system.
• Step 4 - Data preparation – Once you have the data, a lot of cleaning and preparation is required: reduce, increase, combine or split the predictors, identify and eliminate outliers, populate missing values, convert some categorical variables into numerical ones, etc.
• Step 5 - Exploratory data analysis – In this step we do descriptive and diagnostic analysis of the existing data. We build multiple graphs that point us towards the next steps of predictive analytics. We might also consider clustering the data and checking for patterns.
• Step 6 - Predictive modeling – In this step we apply one or more algorithms/calculations to get the predictive model.
• Step 7 - Reporting and visualization – Represent the insights/results graphically wherever possible and make them easy for the business to interpret.
• Step 8 - Interpreting results – Work with the business on decision making and on appropriately implementing the decisions taken.
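To make the data preparation step a little more concrete, here is a minimal sketch in plain Python of three of the tasks mentioned above: populating missing values, eliminating outliers and converting a categorical variable into numerical ones. Everything in it is an assumption made for illustration: the sample records, the column names (age, segment), the choice of the median for filling gaps and the 1.5 x IQR rule for outliers. A real project would choose these with the business problem in mind and would probably use a data library rather than hand-written helpers.

```python
from statistics import median, quantiles

# Hypothetical raw records: 'age' has one missing value and one obvious
# data-entry error; 'segment' is a categorical predictor.
raw = [
    {"age": 34, "segment": "retail"},
    {"age": None, "segment": "pharma"},
    {"age": 29, "segment": "retail"},
    {"age": 41, "segment": "pharma"},
    {"age": 38, "segment": "retail"},
    {"age": 250, "segment": "retail"},
]

def fill_missing(rows, key):
    """Replace missing values with the median of the observed values."""
    observed = [r[key] for r in rows if r[key] is not None]
    fill = median(observed)
    return [{**r, key: fill if r[key] is None else r[key]} for r in rows]

def drop_outliers(rows, key, k=1.5):
    """Keep only rows whose value lies within k * IQR of the quartiles."""
    q1, _, q3 = quantiles([r[key] for r in rows], n=4, method="inclusive")
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [r for r in rows if low <= r[key] <= high]

def one_hot(rows, key):
    """Turn a categorical column into one 0/1 indicator column per level."""
    levels = sorted({r[key] for r in rows})
    encoded_rows = []
    for r in rows:
        encoded = {k: v for k, v in r.items() if k != key}
        for level in levels:
            encoded[f"{key}_{level}"] = int(r[key] == level)
        encoded_rows.append(encoded)
    return encoded_rows

# Fill gaps first, then remove outliers, then encode the categorical column.
prepared = one_hot(drop_outliers(fill_missing(raw, "age"), "age"), "segment")

for row in prepared:
    print(row)
```

The order matters in this sketch: gaps are filled before the outlier check so that every row has a value to compare, and the encoding comes last so that indicator columns are only created for rows that survive the cleaning.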
These steps can be split across focused roles:
Steps 1 and 8 – Business Analyst
Steps 2, 3 and 4 – Data Expert/Data Engineer
Steps 5 and 6 – Data Analyst
Step 7 – Reporting Expert
Business analysts and data analysts should have industry skills. So, instead of just a ‘Data Analyst’, it is good to have a ‘Retail Data Analyst’ or a ‘Data Analyst – Pharma’, and so on.
This way a data analyst can focus on mastering (using and customizing) the handful of algorithms that he/she needs to build predictive models useful for that industry.
Data engineers and reporting experts should have a good grip on the tools and technologies that enable their jobs.
If the data scientist role is simplified and broken down into more focused roles, one can choose the industry and the role based on one's interests and strengths. When I am analyzing data, I don't want to worry about how my DB can connect to various data sources within the landscape or how to improve the report performance!