
In an organization using SAP HANA, data resides in memory to achieve high performance. However, as the data grows, so does the memory required to store it, and that additional memory raises cost. Enterprises implementing SAP HANA should follow a data persistence strategy that builds a smart storage infrastructure based on the business value of data, addressing storage requirements efficiently and at lower cost.
Making Storage Strategy Smarter
In SAP HANA, not all data is accessed frequently, yet all of it has to reside in memory, which increases the amount of main memory used. Historic or ‘cold’ data can instead be kept in a separate data store built on a less expensive storage option. This data can still be accessed at any time, with the necessary performance at lower cost. The end result is a storage infrastructure that meets the storage requirements of the business efficiently and cost-effectively.
Data can be classified into
When historic or cold data is moved to separate storage, main memory usage drops, hardware resources are freed, and the static data remains available. Access to this data still requires fast reads, but at lower cost. Maintaining all data, including infrequently accessed static data, in a high-performance online environment can be very expensive or simply impractical because of the limitations of the databases used in the data warehouse.
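To make the idea concrete, here is a minimal sketch of how data could be split into frequently accessed and cold sets, and how much main memory moving the cold set would free. Everything in it is an assumption for illustration: the `Partition` record, the one-year `COLD_AFTER` threshold and the partition names and sizes are hypothetical, and it does not use any SAP HANA API.

```python
from dataclasses import dataclass
from datetime import date, timedelta

# Hypothetical rule: data untouched for more than a year is treated as cold.
COLD_AFTER = timedelta(days=365)


@dataclass
class Partition:
    name: str            # illustrative partition name
    size_gb: float       # in-memory footprint in GB (made-up figures)
    last_accessed: date  # date of the most recent read


def classify(partitions, today, cold_after=COLD_AFTER):
    """Return (hot, cold) lists based on time since last access."""
    hot, cold = [], []
    for p in partitions:
        if today - p.last_accessed > cold_after:
            cold.append(p)
        else:
            hot.append(p)
    return hot, cold


def memory_freed_gb(cold):
    """Main memory released if every cold partition leaves memory."""
    return sum(p.size_gb for p in cold)


if __name__ == "__main__":
    today = date(2024, 1, 1)
    partitions = [
        Partition("sales_2023", 120.0, date(2023, 12, 20)),
        Partition("sales_2021", 200.0, date(2022, 3, 1)),
        Partition("sales_2020", 180.0, date(2021, 6, 15)),
    ]
    hot, cold = classify(partitions, today)
    print("keep in memory:", [p.name for p in hot])
    print("move to cheaper storage:", [p.name for p in cold])
    print("memory freed (GB):", memory_freed_gb(cold))
```

In practice the rule would come from the business value of the data rather than from access dates alone; the threshold here only stands in for whatever criterion the organization agrees on.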
What Data needs to be persisted?
This is an important exercise to undertake before embarking on any data warehouse project. Because in-memory solutions are costly, it is worth first understanding the organization’s data requirements. Some pointers on what data does and does not need to be persisted are given below.
Strong information lifecycle management covering the above points is required to arrive at an effective data persistence strategy for an organization. Some of the key benefits of a successful data persistence strategy are:
1) Better resource usage – in terms of disk, CPU and memory
2) System availability
3) System performance
4) Analysis with right set of data
Option – 1
Implementing a near-line component makes it possible to keep less frequently accessed data, such as aged information or detailed transactions, more cost-effectively. In addition, removing relatively static data from the data warehouse allows regular maintenance activities to run more quickly and gives business users higher data availability.
Option – 2
Apache Hadoop and Data warehouse
As enterprises start analyzing larger amounts of data, migrating it over the network for analysis becomes unrealistic. Analyzing terabytes of data in memory every day can strain the processing capacity of the system and also occupies more main memory. With Hadoop, data is loaded directly onto low-cost commodity servers just once and is transferred to other systems only when required.
Hadoop is a true “active archive”: it not only stores and protects the data, but also enables users to quickly, easily and perpetually derive value from it.
Hadoop and the data warehouse can work together in a single information supply chain. Cold or archived data can be stored in Hadoop, which can act as an online archive and an alternative to tape. Beyond serving as a storage mechanism, Hadoop also helps with real-time data loading, parallel processing of complex data, and discovering unknown relationships in the data.
What is Hadoop good at?
Since NLS has been discussed extensively in various forums and blogs, a subsequent discussion will cover how Hadoop can be integrated with SAP HANA for an effective data persistence strategy.