
Comparing the BI Accelerator (f.k.a. HPA) to Traditional RDBMS Technology

NW 2004s comprises a new BI accelerator (BIA), formerly known as High Performance Analytics (HPA). This new functionality makes it possible to boost the performance of multi-dimensional queries (of NW2004s BI, i.e. BW) by factors rather than just margins. At the same time, it reduces the administrator's effort to achieve such outrageous performance to a minimum, especially in comparison to traditional performance-enhancing methods like precomputed structures such as database indexes or aggregates (a.k.a. materialized views or summary tables). For a more detailed description of the BI accelerator, refer to an article published in the October 2005 issue of SAP Insider.

In this weblog, I'd like to quickly touch on a question that many might have in mind when looking at this new technology: why the heck is this so much faster than traditional RDBMS products? Can't the latter be similarly fast if there is

  • sufficient main memory,
  • parallelism,
  • and maybe even column-oriented and compressed structures like bitmap indexes?
These three are often cited as being among the major sources of performance within the BI accelerator. There are more than just those three - like the absence of rollback and logging mechanisms, or processing technology tailored towards NW BI's star schema - but, below, I will focus on those three.

Well, in principle, traditional RDBMS have a certain amount of technology in place that - in the mid- to long-term - could be evolved into something similar to what the BIA has to offer. Let us simply consider a bitmap index. The latter can be regarded as a compressed representation of a table column (in theory, a bitmap index can cover more than one column but, in practice, this hardly makes sense). Its predominant advantage is that it can easily be combined with other bitmap indexes (e.g. on the same table) through bit operations. So, theoretically, a table or parts of it can be completely represented by a set of bitmaps. Bitmaps can be highly compressed and should thus fit into main memory. Data parallelism in this context is also straightforward.
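To make that concrete, here is a minimal sketch of the idea in Python. The tiny columns, the filter values, and the helper build_bitmap_index are invented for illustration only and say nothing about how the BIA actually stores its data:

```python
# Minimal sketch of a bitmap index: one bitmap per distinct column value.
# Python's arbitrary-precision ints serve as the bit vectors here.

def build_bitmap_index(column):
    """Map each distinct value to a bitmap marking the rows that contain it."""
    index = {}
    for row_id, value in enumerate(column):
        index[value] = index.get(value, 0) | (1 << row_id)
    return index

# Two dimension-like columns of a tiny, made-up fact table.
years  = [2004, 2005, 2005, 2004, 2005]
cities = ['Paris', 'London', 'Paris', 'London', 'London']

year_idx = build_bitmap_index(years)
city_idx = build_bitmap_index(cities)

# Combining the filters YEAR=2005 and CITY='London' is a single bitwise AND.
hits = year_idx[2005] & city_idx['London']
qualifying_rows = [r for r in range(len(years)) if (hits >> r) & 1]
print(qualifying_rows)  # -> [1, 4]
```

Data parallelism then amounts to little more than partitioning the row range, building and AND-ing bitmaps per partition, and concatenating the partial results.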

So: what hinders an RDBMS from getting there? Take a typical star join approach in which bitmaps are used to propagate to the fact table the selectivity that originates in the filters defined on a number of dimensions. Imagine the filters YEAR=2005 and CITY='London'. Each filter in itself is less selective than the combination of the two, i.e. it matches more rows. In this example, it is likely that two bitmaps, one for each filter, would be merged to create the list of rows in the fact table that need to be aggregated. At exactly this moment, a switch from a column-oriented towards a row-oriented approach takes place. Subsequently, processing is row-oriented with all its overhead due to unnecessary I/O: just consider a fact table with 20 key figures (measures) out of which only 2 are relevant to the actual query. In the end, the values of all 20 key figures are transported from disk to main memory and from there towards the CPU. Only there will 18 out of the 20 be discarded. The same argument applies to dimension columns that are not relevant to the actual query.
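A back-of-the-envelope sketch shows what that switch costs in moved data. Every constant below - table size, selectivity, value width - is an assumption chosen for illustration, not a measurement of any actual system:

```python
# Rough comparison of data volume moved for the query above, assuming
# (hypothetically) 8-byte key figures and a 100-million-row fact table.

ROWS        = 100_000_000   # fact table rows (assumed)
KEY_FIGURES = 20            # key figures per row
RELEVANT    = 2             # key figures the query actually aggregates
BYTES       = 8             # bytes per key figure value (assumed)
SELECTIVITY = 0.01          # fraction of rows surviving the bitmap merge (assumed)

qualifying = int(ROWS * SELECTIVITY)

# Row-oriented: every qualifying row is fetched in full, all 20 key figures.
row_store_bytes = qualifying * KEY_FIGURES * BYTES

# Column-oriented: only the two relevant key-figure columns are touched.
column_store_bytes = qualifying * RELEVANT * BYTES

print(f"row store:    {row_store_bytes / 1e6:.0f} MB moved")    # 160 MB
print(f"column store: {column_store_bytes / 1e6:.0f} MB moved") # 16 MB
print(f"ratio:        {row_store_bytes / column_store_bytes:.0f}x")
```

Even at identical selectivity, the row-oriented plan moves ten times the data for this query shape, and column-wise compression typically widens that gap further.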

The above describes only one amongst various situations where the traditional RDBMS approach falls behind. This doesn't mean that traditional RDBMS are all that bad. They are simply so much more generic than the BI accelerator can ever be. In other words: the BIA constitutes a highly specialized approach targeting multi-dimensional query processing, which is predominant in BI systems and which is therefore worthwhile even at the expense of losing some generic qualities. Those qualities are simply not required in that context.
