With advances in technology, we have seen an unprecedented explosion of data, whether in terms of its volume, velocity, variety, veracity, or value (the 5 V's). Research suggests that the data generated since 2009 far exceeds the data generated in all of prior human history. This data deluge presents a sea of opportunities which, if leveraged well, can prove to be game changers and help businesses run like never before. This is the Big Data challenge, which is a buzzword these days.
R, a free software programming language and environment for statistical data analysis and graphics, can be used to explore datasets and gain insights. Though I was initially skeptical about being able to comprehend R, I took a few tutorials, found it interesting, and thought of sharing my learning experience. You can check http://www.r-project.org/ for detailed information on R.
I would like to walk through a simple analysis and visualization example using a predefined dataset in R: UKLungDeaths {datasets}, three time series giving the monthly deaths from bronchitis, emphysema and asthma in the UK, 1974–1979. You can check out the dataset at the following link: http://stat.ethz.ch/R-manual/R-devel/library/datasets/html/UKLungDeaths.html
As a prerequisite, download and install R, then follow these steps:
1. Launch RGui. Since this is a predefined dataset that ships with R, it is available right away and needs no explicit loading step.
2. You can check what the dataset looks like by typing the following in the R console:
ldeaths # dataset of Monthly Deaths from Lung Diseases in the UK - both sexes
mdeaths # dataset of Monthly Deaths from Lung Diseases in the UK - Males
fdeaths # dataset of Monthly Deaths from Lung Diseases in the UK - Females

3. To plot the visualization, type the following:
par(mfrow=c(1,3)) #combine multiple plots into one overall graph
plot(ldeaths, xlab="Year", ylab="Both sexes", main="Monthly Deaths from Lung Diseases in the UK - both sexes") #plots line chart for monthly death for both sexes
plot(mdeaths, xlab="Year", ylab="Males", main="Monthly Deaths from Lung Diseases in the UK - Males") #plots line chart for monthly death for males
plot(fdeaths, xlab="Year", ylab="Females", main="Monthly Deaths from Lung Diseases in the UK - Females") #plots line chart for monthly death for females
The plot looks as shown below:

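The three panels above each get their own y-axis, which can make direct comparison between the series harder. As a possible variation on step 3, here is a minimal sketch that overlays all three series on one shared axis using ts.plot from the stats package that ships with R. The legend labels, line types, and variable names are arbitrary choices for illustration.

```r
# Labels and line types for the three series (illustrative choices)
series_labels <- c("Both sexes", "Males", "Females")
line_types    <- 1:3

# Switch to a single panel, remembering the previous layout
old_par <- par(mfrow = c(1, 1))

# ts.plot() draws several time series on a common time axis and y-axis
ts.plot(ldeaths, mdeaths, fdeaths,
        gpars = list(xlab = "Year", ylab = "Monthly deaths",
                     lty = line_types,
                     main = "Monthly Deaths from Lung Diseases in the UK"))
legend("topright", legend = series_labels, lty = line_types)

# Restore the previous layout
par(old_par)
```

With a shared axis it should be easier to see how much each sex contributes to the combined series in any given month.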
Thus you can see for yourself how simple it is to visualize a dataset in R. With the above visualization, you can draw various insights, such as:
- Both the highest and the lowest monthly death counts were recorded in 1976 (totals for both sexes).
- The numbers fluctuate from month to month within each year, but the overall pattern repeats consistently across the years.
- Toward the end of 1979, the numbers appear to be on the decreasing side.
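These readings of the chart can also be cross-checked numerically. The following is a small sketch using only base R functions (which.max, which.min, time, cycle, aggregate); the variable names are illustrative.

```r
# Positions of the highest and lowest combined monthly counts
i_max <- which.max(ldeaths)
i_min <- which.min(ldeaths)

# time() gives the fractional year of each observation, cycle() the month number
peak_year    <- floor(time(ldeaths)[i_max])
peak_month   <- month.name[cycle(ldeaths)[i_max]]
trough_year  <- floor(time(ldeaths)[i_min])
trough_month <- month.name[cycle(ldeaths)[i_min]]

cat("Highest:", ldeaths[i_max], "in", peak_month, peak_year, "\n")
cat("Lowest: ", ldeaths[i_min], "in", trough_month, trough_year, "\n")

# Yearly totals make any longer-term rise or fall easier to judge
yearly_totals <- aggregate(ldeaths, nfrequency = 1, FUN = sum)
print(yearly_totals)
```

If the first observation holds, both printed years should be 1976. The yearly totals give a rough way to judge whether the numbers are really lower by 1979 or whether the chart is only showing the usual seasonal dip.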
This is, however, a very simple example of data visualization using R.
The R language has a large number of packages that support various statistical methods and functions, which can be used for more complex scenarios and use cases.
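As one small illustration of such statistical functions, the sketch below uses decompose from the stats package that ships with R (so no extra installation is assumed) to split the combined series into trend, seasonal, and random components. This is just one possible next step, picked here for illustration.

```r
# Classical additive decomposition of the monthly series (frequency = 12)
parts <- decompose(ldeaths)

# List the components stored in the result (trend, seasonal, random, ...)
names(parts)

# plot() on the result draws the observed, trend, seasonal and random panels
plot(parts)
```

The seasonal panel isolates the repeating within-year pattern, while the trend panel smooths it out to show the longer-term movement.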
SAP HANA already has integration with R.
I hope this post encourages you to explore R further and come up with interesting analyses and visualizations.
Thanks for reading!