Barry Devlin of 9sight Consulting gave this BrightTalk webcast last week. With his permission (and BrightTalk's), I share my notes below.
The abstract read:
“Business and IT are facing the challenge of getting real and urgent value from ever-expanding information sources. Building independent silos of big data analytics is no longer enough. True progress comes only by integrating data from traditional operational and informational sources with the new sources that are becoming available, whether from social media or interconnected machines.” You can watch the replay at https://www.brighttalk.com/webcast/9059/95949
Figure 1: Source: 9sight Consulting
Figure 1 shows how Big Data Analytics began – to understand and track what is going on
Now we have new information. This is a simple but new source of data
There are two pieces: basic BI and operational BI, which gives real-time insight into the web site. An example is understanding, in real time, why a customer would abandon a cart
Another is what a customer would most likely buy in a cross-sell or up-sell
Figure 2: Source: 9sight Consulting
Figure 2 shows how IoT drives huge quantities of data, with opportunities to re-invent business and create new businesses entirely
It extends the supply chain to the consumer: an internet-connected refrigerator is capable of monitoring the goods inside and can tell the supplier what you need to buy next
A new business process is motor insurance: a spread-the-risk type of model, with a sensor in the car reporting on driving behavior
With health monitoring, you used to bring people to the hospital, but now you can monitor them at home to measure vital signs
This raises privacy and security issues, as big data tends to do
Figure 3: Source: 9sight Consulting
You still need traditional business data
This includes the legal record of the business: we did this business, we shipped this, we invoiced you, it is time to pay
Big data usually comes from unreliable, unrelated sources
Figure 4: Source: 9sight Consulting
The picture on the right of Figure 4 is from a 1988 IBM Systems Journal article and shows a layered architecture
Tactical decisions were made based on reconciled data
This is now superseded by speed
Figure 5: Source: 9sight Consulting
Figure 5 shows new types of data: machine-generated data and human-sourced information
The horizontal axis of the image on the right of Figure 6 shows timeliness and consistency
The vertical axis shows structured content
At the bottom left, machine-generated data includes sensors, which are a major direction for IoT
At the top is human-sourced information. This is subjective and reflects personal experience, ranging from tweets to videos
Process-mediated data has two arrows pointing to it. Before the internet, we were capturing data from machines and from people. The machines were inside our businesses, such as an ATM, whose signals were brought into process-mediated form. The human source was a clerk in a bank taking down information (customer, name, address), which was turned into process-mediated data
When we put it through business processes, it turns into process-mediated data
Do we bring everything into process-mediated data? That is not the right answer
With modern machine data and human data, the defining characteristic is uncertainty
We need the ability to treat these three types of data differently
Figure 6: Source: 9sight Consulting
Figure 6 shows a modern IT environment, in a logical way
REAL stands for real, extensible, actionable, labile (flexible)
Sources include measures, events, and messages, which are instantiated to create transactions; these sources are less well managed, and the architecture puts them into pillars
Figure 7: Source: 9sight Consulting
Figure 7 shows how you can integrate sources and stores through this architecture
With operational processes, you create transactions that are part of the legal flow of the business
The box in the middle of Figure 7 shows assimilation
These are pillars rather than layers
This is different from a data lake or reservoir
Figure 8: Source: 9sight Consulting
Figure 8 shows the relational model as the core model
Columnar, compressed storage gives the ability to do different types of processing than row-based storage
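To make the columnar point concrete, here is a minimal sketch in Python. It is my own illustration, not something from the webcast: the sample orders, the column names, and the helper functions are all hypothetical, and real columnar databases do far more than this. It only shows why storing values by column lets an aggregate touch one column, and why repetitive columns compress well.

```python
# Hypothetical sample data; the field names are illustrative only.
rows = [
    {"order_id": 1, "region": "EMEA", "amount": 120.0},
    {"order_id": 2, "region": "APJ", "amount": 80.0},
    {"order_id": 3, "region": "EMEA", "amount": 200.0},
]


def to_columns(records):
    """Pivot a list of row dicts into a dict of column lists."""
    columns = {}
    for record in records:
        for name, value in record.items():
            columns.setdefault(name, []).append(value)
    return columns


def run_length_encode(values):
    """Compress a column into [value, count] runs (works best on sorted or repetitive data)."""
    runs = []
    for value in values:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return runs


columns = to_columns(rows)

# An aggregate reads a single column instead of every field of every row.
total_amount = sum(columns["amount"])

# A sorted, low-cardinality column collapses into a few runs.
region_runs = run_length_encode(sorted(columns["region"]))

print(total_amount)
print(region_runs)
```

A row store would have to read each whole order to total the amounts; the column layout reads only the `amount` list.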
Figure 9: Source: 9sight Consulting
On this slide he mentioned there is a white paper on eBay.
eBay used a relational database to take in machine-generated data
Figure 10: Source: 9sight Consulting
Hadoop is a key technology for handling human-sourced data
This information is soft, large, ill-defined, and lacking in known structures
This is an evolving area, more diverse. It is largely batch; Hadoop 2.0 enables more real-time processing
It is a programming environment, whereas BI/EDW is declarative (descriptions), in contrast to the procedural approach of Hadoop
There is a push towards human sourced content
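The declarative versus procedural contrast in these notes can be sketched as follows. This is my own hedged illustration and was not shown in the webcast: SQLite stands in for a BI/EDW engine, a plain Python loop stands in for a map/reduce-style program, and the click events are made up.

```python
import sqlite3
from collections import defaultdict

# Hypothetical click events: (page, seconds_on_page).
events = [("home", 5), ("cart", 30), ("home", 7), ("cart", 10)]


def declarative_totals(data):
    """Describe the result in SQL and let the engine decide how to compute it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE events (page TEXT, seconds INTEGER)")
    conn.executemany("INSERT INTO events VALUES (?, ?)", data)
    rows = conn.execute(
        "SELECT page, SUM(seconds) FROM events GROUP BY page"
    ).fetchall()
    conn.close()
    return dict(rows)


def procedural_totals(data):
    """Spell out each step: emit key/value pairs, then sum them by key."""
    mapped = ((page, seconds) for page, seconds in data)  # "map" step
    totals = defaultdict(int)
    for page, seconds in mapped:  # "reduce" step
        totals[page] += seconds
    return dict(totals)


print(declarative_totals(events))
print(procedural_totals(events))
```

Both functions return the same totals; the difference is who decides how the work gets done, the engine or the programmer.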
Figure 11: Source: 9sight Consulting
Three types of processing are shown in Figure 11
It spans all of IT, not just IT
Instantiation turns measures, events, and messages into instances
In the middle is assimilation: reconciling information before making it useful for the business
Reification makes the abstract real, as in data virtualization
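The three processing types above might be sketched like this. It is only my interpretation of the notes, not the architecture's actual definition: the message format, the `Instance` class, the customer keys, and all three function bodies are assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    source: str
    customer: str
    value: float


def instantiate(source, raw_message):
    """Turn a raw 'customer=value' message (assumed format) into a structured instance."""
    customer, value = raw_message.split("=")
    return Instance(source, customer.strip(), float(value))


def assimilate(instances, key_map):
    """Reconcile differing customer identifiers onto one agreed key."""
    return [
        Instance(i.source, key_map.get(i.customer, i.customer), i.value)
        for i in instances
    ]


def reify(instances, customer):
    """Compute a combined view on demand instead of storing another copy."""
    return sum(i.value for i in instances if i.customer == customer)


# Hypothetical messages from two sources that name the same customer differently.
raw = [("web", "cust-42 = 10.5"), ("erp", "C0042=20")]
instances = [instantiate(source, message) for source, message in raw]
reconciled = assimilate(instances, {"cust-42": "C0042"})
total = reify(reconciled, "C0042")

print(total)
```

Without the assimilation step, the web and ERP records would look like two different customers and the combined view would be wrong.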
Figure 12: Source: 9sight Consulting
Metadata is not data, it is information and much softer than data
It describes processes and people
For non-IT people, it is indistinguishable from information
The NSA stole the word and made it big news
CSI (not the TV show) provides information that is brought across the pillars to ensure they are talking the same language: communicating instead of simply processing
An example is the Mars Orbiter metadata error: programs in the Orbiter used metric measures, while those on the ground used imperial measures (page 76 of his book)
You still do up-front modeling for the data that we understand and know; the rest is done with text mining
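The Mars Orbiter example suggests why a unit should travel with the number as metadata. Below is a minimal sketch of that idea in Python. It is my own illustration, not from the webcast or the book: the `Impulse` class and the choice of impulse as the measured quantity are assumptions; the conversion factor is the standard one for pound-force to newtons.

```python
from dataclasses import dataclass

# Newton-seconds per pound-force second (1 lbf = 4.4482216152605 N).
LBF_S_TO_N_S = 4.4482216152605


@dataclass(frozen=True)
class Impulse:
    """A measurement that carries its unit as metadata (hypothetical class)."""
    value: float
    unit: str  # "N*s" (metric) or "lbf*s" (imperial)

    def to_newton_seconds(self):
        """Convert to metric, refusing values whose unit is unknown."""
        if self.unit == "N*s":
            return self.value
        if self.unit == "lbf*s":
            return self.value * LBF_S_TO_N_S
        raise ValueError(f"unknown unit: {self.unit}")


ground_reading = Impulse(1.0, "lbf*s")
print(ground_reading.to_newton_seconds())
```

A bare `1.0` passed between two programs says nothing about its unit; the tagged value either converts correctly or fails loudly.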
Figure 13: Source: 9sight Consulting
Figure 13 shows his book, which I own but have not read yet. The book came highly recommended to me.
Figure 14: Source: 9sight Consulting
The conclusion is in Figure 14.
Plan now to get a head start on learning at 2014 ASUG Annual Conference with a pre-conference seminar: SAP BusinessObjects BI 4.1, SAP BW, and SAP BW on SAP HANA® – All in One Day. Join speaker Ingo Hilgefort as he shares his insight and provides a full day of hands-on training
A Verdict on Big Data MIT MOOC
Upcoming ASUG Big Data Webcasts:
Learn more about Big Data from customer sessions at ASUG Annual Conference:
Adobe Shares Managing Big Data Across a Logical Data Warehouse with SAP HANA, SAP IQ, and Hadoop
Norwegian Cruise Line’s Industry-Leading Blueprint to Leverage Big Data for Competitive Advantage
Also learn about "Big Data" or HANA from the following ASUG Pre-conference sessions:
Jump Start ASUG Annual Conference SAPPHIRE with a Pre-Conference Session - Back and Better than Ever
In-depth sessions on EIM at ASUG Annual Conference: Try a pre-con!
If you have a big data story to share, ASUG invites you to submit an abstract for SAP dcode for Las Vegas (aka SAP TechEd) - call for presentations is planned to start April 21st.