Affordable data management solutions, particularly cloud-based ones, are easy to cite. The problem is that they are rarely also "high-performance" solutions. SAP HANA has proven over its 10-year life to be a super-fast, in-memory, columnar database that handles transactional and analytical workloads equally well. However, that's not the point of this article. This article shows you how SAP can manage petabytes of highly accessible data, delivering it to any person, process, or application at unparalleled speed for a price that will leave you wondering why you didn't consider SAP for your data hosting needs.
Introducing SAP HANA Cloud Data Lake
Let's start with a common journey organizations embark upon...
A common "move our data to the cloud" scenario looks like this:
- replicate multiple business application source databases
- convert & store them in cheap cloud file-based storage
- convert data sets of interest back to structured format and move them into a cloud database-as-a-service to gain performance
- potentially restructure data further (perhaps de-normalize) for analytical purposes
- aggregate data to gain better performance
Here are a few problems with this approach:
- significant effort to go from structured to file-based and back to structured (to host analytics), and business context is lost along the way
- including the source systems, there could be 4-8 copies of the data by the time it's consumed by an end user
- multiple data stores introduce complexity in managing data (different interfaces, security, communication protocols, modeling efforts, and so on)
- going from a finished, and likely aggregated, report back to the detail level of data is hard to deliver
- latency: data is unlikely to be real-time or available for analysis at the moment it's available to the organization
SAP HANA Cloud with Data Lake:
- one copy of the data or no copies if federated
- eliminates the need for aggregations
- real-time, available for consumption as it's available to the organization
- fraction of the physical size of source(s) - columnar compression
- already in the right format to promote analytics
Many organizations, particularly their individual lines of business, know the business value of their data. That knowledge sits in an organizational silo, though: a veritable competency center within the organization. What they often don't know is the value of much of their data when joined to data outside of their competency. This is where data volumes explode and organizations rush to cheap cloud storage solutions, as covered above.
Instead, why not leave at least the SAP data in structured format? Then grab non-SAP data from wherever you are storing it (cloud or on-prem, file-based or structured storage) and join it to the SAP data. By the way, if you want to get the data from its source system and don't want another copy, just leave it there and federate on demand. If you're worried about performance, replicate it in real time to SAP's affordable data storage solution: SAP HANA Cloud Data Lake.
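To make the "federate or replicate" choice concrete, here is a minimal sketch that uses Python's built-in sqlite3 module as a stand-in. It is an analogy only, not SAP HANA Cloud syntax: the attached `remote` database plays the non-SAP source, and the table names and sample rows are made up for illustration. Note that the replication here is a one-time copy, not the real-time replication described above.

```python
import sqlite3

# Analogy only: "main" stands in for the structured SAP data,
# "remote" for a separate non-SAP source. All names and rows are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("ATTACH DATABASE ':memory:' AS remote")

conn.execute("CREATE TABLE main.orders (customer_id INTEGER, amount REAL)")
conn.executemany(
    "INSERT INTO main.orders VALUES (?, ?)",
    [(1, 120.0), (2, 75.5), (1, 30.0)],
)

conn.execute("CREATE TABLE remote.customers (customer_id INTEGER, region TEXT)")
conn.executemany(
    "INSERT INTO remote.customers VALUES (?, ?)",
    [(1, "EMEA"), (2, "APJ")],
)

JOIN_SQL = """
    SELECT c.region, SUM(o.amount)
    FROM main.orders AS o
    JOIN {customers} AS c ON c.customer_id = o.customer_id
    GROUP BY c.region
    ORDER BY c.region
"""

# Option 1: federate. Join against the remote table in place; no copy is made.
federated = conn.execute(JOIN_SQL.format(customers="remote.customers")).fetchall()

# Option 2: replicate. Copy the remote table next to the local data, then join.
conn.execute("CREATE TABLE main.customers_replica AS SELECT * FROM remote.customers")
replicated = conn.execute(
    JOIN_SQL.format(customers="main.customers_replica")
).fetchall()

print(federated)
print(replicated)
```

Both queries return the same rows. The difference is where the customer data lives when the join runs: federation leaves it at the source, while replication trades one extra copy for local access.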
A few words on data temperature:
SAP calls SAP HANA Cloud Data Lake "cold storage". This is an accurate term from an industry perspective if you want to look at file-based storage as "frozen storage". Why? Because SAP's Data Lake is highly accessible and can serve most organizational needs without changing its location. At SAP, "warm" or "hot" storage is used when the value of the data is qualified to a greater degree. This value qualification might mean data is needed faster, more frequently, and/or by more people/processes/applications. As these dynamics increase, an organization leans on "warm" OR "hot" storage. "Cold", "warm", and "hot" data storage are all within one solution at SAP: HANA Cloud.
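The value qualification described above can be sketched as a simple rule that assigns each data set exactly one temperature. This is a toy illustration under stated assumptions: the thresholds, the single `reads_per_day` signal, and the data set names are all hypothetical, and a real decision would also weigh required speed and the number of consumers.

```python
from dataclasses import dataclass

# Hypothetical thresholds for illustration only; these are not SAP values.
HOT_MIN_READS_PER_DAY = 1000
WARM_MIN_READS_PER_DAY = 10


@dataclass
class DataSet:
    name: str
    reads_per_day: int  # how often the data is requested


def temperature(ds: DataSet) -> str:
    """Return exactly one tier for a data set based on how often it is read."""
    if ds.reads_per_day >= HOT_MIN_READS_PER_DAY:
        return "hot"
    if ds.reads_per_day >= WARM_MIN_READS_PER_DAY:
        return "warm"
    return "cold"


# Made-up data sets with made-up access rates.
datasets = [
    DataSet("current_quarter_sales", 5000),
    DataSet("last_year_sales", 50),
    DataSet("sales_archive_2012", 1),
]

placement = {ds.name: temperature(ds) for ds in datasets}
for name, tier in placement.items():
    print(f"{name}: {tier}")
```

Because the function returns a single tier, each data set lands in "cold" OR "warm" OR "hot", never in more than one.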
The value and simplicity of ONE:
Please take special note of the operative word "OR". Referring to data temperature ("cold", "warm", or "hot") as it relates to data storage means we need only one copy of the data. Data is stored in "cold" OR "warm" OR "hot". Again, ONE COPY! For the data modeling persona, this is one table partitioned to store the data (cold/warm/hot) based on its relative value. Where the data is stored is transparent to an end-user persona, who might start with an aggregated multi-LOB dashboard that combines data across the enterprise and delivers it with sub-second response time (hot storage). As they drill into the data looking for additional insights, and the speed, frequency, and/or volume of requests decreases, the data is found in the most appropriate temperature - but it's still fast. Here's an example:
The above is the same technology found in SAP HANA Cloud, with the exception that HANA and IQ are fully integrated. Bottom line: start with an inexpensive, tremendously scalable, real-time, high-performance data management solution hosted in the cloud and avoid the many hoops the other guys have you jumping through to go from many source systems to consolidated business insights.
If you want to get an idea of how much this might cost before you start discussions with an SAP sales representative, check out the SAP HANA Cloud Pricing Estimator. Remember, choose a custom configuration where you start with cold storage and work your way into warm and hot as the need arises. If you want to really get a feel for what you can do with SAP HANA Cloud, why not trial it before you buy it?