Over the past few months, I've presented on this topic to many customers and colleagues. As there seems to be such high demand, I've decided to convert the underlying slide presentation into two blogs: HANA and BW 7.30 - Part 1 focuses on the motivation, scenarios and use cases, while this second part looks at the combination of HANA and BW from a technical angle. Before I continue with the second blog, please note that the usual disclaimer applies.
Let's first look back to 2003, when the in-memory efforts within BW started and eventually led to BWA (a.k.a. HPBI, HPA, BIA); see Pushing BI To A New Frontier for background. Understanding this evolution will also help you understand the rationale behind future developments.
Figure 1 shows a matrix with the various layers of BW in the rows and the recent releases in the columns. You can see that the initial investments went into the natural first targets, namely the SQL processing behind InfoCube-based queries. With BWA 7.2 this was extended to MultiProviders, and there is even de facto support of DSOs via the HybridProvider. See this document for more details on the BW 7.3 / BWA 7.2 combination. Looking beyond that, I will show below some examples of the in-memory impact on the planning engine and on the data warehousing layer.
Figure 1: Evolving In-Memory Footprint in SAP BW
There are certain advantages to moving from a BW setup based on a classic RDBMS server complemented by BWA to a BW system sitting on top of a single HANA server that combines both capabilities. From a technical perspective, a single HANA instance removes the need to manage consistency across two servers (as with an RDBMS plus a BWA). This is particularly interesting for planning (write-back) scenarios and simplifies matters a lot. From an administration perspective, the picture is more ambiguous: some BW customers like the separation whereby the RDBMS handles the warehousing load and the BWA handles the query load. On the other hand, two servers mean two hardware installations and two licenses to maintain.
Let's now turn to a category of features that goes beyond the well-known, outstanding query performance advantages of HANA and BWA, namely the evolution of BW-IP (Integrated Planning) towards in-memory processing.
Figure 2 shows a "Hello World" example for a planning operation: a set of cells is displayed to an end user, who decides to increase one of the values from 250 to 300. What happens when this change is submitted to the server? Remember that planning is typically done at an aggregated granularity (here: countries and years), while the data is much more detailed - in the example of figure 2, countries break down into branches and years into weeks. This means that changing a single value in the UI frequently translates into a large number of changes on the data level. This is called disaggregation and is a frequent operation in a planning context.
In the traditional approach - meaning that processing is mostly done in the application server - the delta of the change (here: an increase of 50) is calculated, then that delta is broken down to the actual data granularity (here: 52 weeks in 2011 and 500 branches in Germany), resulting in a potentially large number of values (here: 52 * 500 = 26,000) that are then sent over the network to the RDBMS to be saved.
In the in-memory based approach, processing is pushed down to HANA, i.e. close to the data, by turning around steps 2 and 3: only a single value (here: 50) is sent to the DB engine, accompanied by the disaggregation instruction and its associated parameters. The performance gains in this approach originate from two sources: far less data travels across the network, and the disaggregation itself is executed natively on the in-memory data structures.
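To make the data-shipping vs. function-shipping contrast concrete, here is a minimal, purely illustrative Python sketch of proportional disaggregation. None of this is actual BW-IP or HANA code; the detail table layout and the proportional rule are assumptions chosen to mirror the figure 2 example.

```python
# Illustrative sketch only, not BW-IP or HANA code: the detail table layout and
# the proportional disaggregation rule are assumptions mirroring figure 2.

def disaggregate(detail_rows, delta):
    """Spread an aggregate-level delta across detail rows in proportion to
    their current values (equal split if the current total is zero)."""
    total = sum(value for _, value in detail_rows)
    if total == 0:
        share = delta / len(detail_rows)
        return [(key, value + share) for key, value in detail_rows]
    return [(key, value + delta * value / total) for key, value in detail_rows]

# 500 branches x 52 weeks for Germany / 2011, summing up to 250 at the UI level.
details = [((f"branch_{b:03d}", f"2011-W{w:02d}"), 250 / (500 * 52))
           for b in range(500) for w in range(1, 53)]

# The user raises the aggregate from 250 to 300. In the in-memory approach only
# the delta (50) plus the disaggregation instruction travels to the engine; in
# the traditional approach all 26,000 changed rows would be shipped back.
new_details = disaggregate(details, delta=50)
print(round(sum(value for _, value in new_details)))  # 300
```

The point is not the arithmetic, which is trivial, but where it runs: next to the 26,000 in-memory detail rows rather than on the application server with the data shipped back and forth.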
This example follows a very generic pattern that can be applied in many other areas too. Some colleagues use terms like data shipping (traditional approach) vs. function shipping (in-memory approach).
Figure 2: Comparing the traditional vs the in-memory approach of a "Hello World" planning example.
The pattern shown in the in-memory planning example, namely (a) avoiding sending huge amounts of data between the application and DB servers, and (b) implementing performance-critical operations directly on the engine-based data structures, can be applied to the BW DataStore object (DSO) too. To that end, it makes sense to recall how a DSO works (conceptually) and where performance becomes critical. Figure 3 shows how a DSO works: new data is loaded into an activation queue, the activation step reconciles it with the current (active) image, and the resulting delta is recorded for downstream consumption.
In a traditional RDBMS environment, querying and data activation are performance critical. Now, as we know, querying is a fundamental strength of HANA and is no longer a concern. Data activation, however, needs to be considered carefully: reconciling the current and the future images typically translates into moving lots of data from the DB to an application server, where the data is matched and the delta is calculated. Data activation is therefore a clear candidate to be pushed down into the HANA engine.
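To illustrate what "reconciling the current and the future images" means, here is a minimal conceptual sketch of the activation step, assuming a DSO keyed by a business key with an active table (current image) and an activation queue (new loads). It is not the actual BW or HANA implementation, which operates on engine-level column-store structures; the before/after-image convention is an assumption for illustration.

```python
# Conceptual sketch of DSO activation, not the actual BW/HANA implementation:
# the key structure and the before/after-image convention are assumptions.

def activate(active_table, activation_queue):
    """Reconcile the activation queue with the active data: update the current
    image and derive the delta (before/after images) for downstream targets."""
    change_log = []
    for key, new_value in activation_queue.items():
        old_value = active_table.get(key)
        if old_value is not None:
            change_log.append(("before", key, -old_value))  # reverse old image
        change_log.append(("after", key, new_value))        # post new image
        active_table[key] = new_value                        # new current image
    return change_log

active_table = {("DE", "2011"): 250.0}                             # current image
activation_queue = {("DE", "2011"): 300.0, ("FR", "2011"): 120.0}  # new loads

change_log = activate(active_table, activation_queue)
print(change_log)
# [('before', ('DE', '2011'), -250.0), ('after', ('DE', '2011'), 300.0),
#  ('after', ('FR', '2011'), 120.0)]
```

In the traditional setup, this row-by-row matching loop runs on the application server after both images have been shipped across the network; pushing exactly this loop down to the data is what makes activation a natural candidate for the engine.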
Figure 3: Traditional DSO
The DSO has been natively implemented in HANA. Figure 4 basically indicates that it has been implemented as a black box that behaves like the traditional DSO shown in figure 3 and described above. The querying and delta read operations become logical views on top of the black box. The upload operation is straightforward. The data activation has been natively implemented and shows significant performance gains. As indicated above, those gains stem from less data traffic and from the native implementation.
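The "logical views" point can be pictured with a short, self-contained continuation of the activation sketch above: once activation runs inside the engine, querying and delta read need no separate data copies, they simply read different internal structures. The view functions, the sample tables and the request pointer are illustrative assumptions, not actual HANA objects.

```python
# Illustrative only: the view functions, sample tables and request pointer are
# assumptions, not actual HANA objects. Data matches the activation sketch above.

active_table = {("DE", "2011"): 300.0, ("FR", "2011"): 120.0}    # current image
change_log = [("before", ("DE", "2011"), -250.0),
              ("after", ("DE", "2011"), 300.0),
              ("after", ("FR", "2011"), 120.0)]                   # activation delta

def query_view(active_table):
    """Reporting reads the current (active) image directly."""
    return dict(active_table)

def delta_read_view(change_log, last_read_position=0):
    """Downstream targets read only the change-log entries written since the
    last extraction, modelled here by a simple position pointer."""
    return change_log[last_read_position:]

print(query_view(active_table))
print(delta_read_view(change_log, last_read_position=0))
```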
Figure 4: In-Memory DSO
This concludes this second part. Hopefully, we will be able to write about and discuss more details in the course of the next months. However, the examples above should provide a good flavour of what is possible and what can be expected.