I was in two minds about where to put this short blog: in the HANA or the Enterprise Information Management (EIM) area of SCN? I decided on the EIM area, as there are already a number of articles in HANA and I wanted to introduce some new capabilities within the more traditional EIM space.
SAP HANA SP09 introduced a whole host of new capabilities. In this blog I’m going to cover 2 of those, Smart Data Integration (SDI) and Smart Data Quality (SDQ), which fall under the umbrella of SAP HANA Enterprise Information Management.
SDI & SDQ can source, replicate, transform and cleanse data, in batch or real time, and load it into SAP HANA in on-premise or cloud environments. This provides a simplified landscape where we can provision and consume data.
I’m not going to go into detail about the architecture here, but more information can be found at help.sap.com/hana_options_eim
For those familiar with SAP Data Services, the design concepts for SDI / SDQ will look similar. We have an HDBFlowGraph (dataflow), sources, transforms and targets.
Transforms are split into 2 main categories: General and Data Provisioning.
General contains the standard capabilities:
Data Source – source table.
Data Sink – target table.
Data Sink (Template Table) – creates a table based on the previous transform’s data structure.
Aggregation – creates an aggregated result set based on the specified aggregation method, such as SUM or COUNT.
Filter – filters the incoming result set based on an expression.
Join – combines data from 2 input tables by using values common to each.
Sort – sorts the incoming result set by one or more columns.
Union – produces a result set from 2 tables with the same schema.
Procedure – call a stored procedure.
AFL Function – Accesses functions of the Application Function Library.
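To make the behaviour of a few of these General transforms concrete, here is a minimal sketch in plain Python. It is not the SDI API and nothing here runs inside HANA; the sample rows, column names and helper functions are all made up for illustration, and each helper only mimics what the matching transform does to a result set.

```python
from collections import defaultdict

# Hypothetical sample rows standing in for two source tables.
orders = [
    {"order_id": 1, "customer_id": 10, "amount": 250.0},
    {"order_id": 2, "customer_id": 11, "amount": 75.0},
    {"order_id": 3, "customer_id": 10, "amount": 120.0},
]
customers = [
    {"customer_id": 10, "name": "Acme"},
    {"customer_id": 11, "name": "Globex"},
]


def filter_rows(rows, predicate):
    """Filter: keep only the rows for which the expression is true."""
    return [row for row in rows if predicate(row)]


def join_rows(left, right, key):
    """Join: inner join on a column common to both inputs."""
    index = defaultdict(list)
    for r in right:
        index[r[key]].append(r)
    return [{**l, **r} for l in left for r in index.get(l[key], [])]


def aggregate_sum(rows, group_by, column):
    """Aggregation: SUM of one column per group."""
    totals = defaultdict(float)
    for row in rows:
        totals[row[group_by]] += row[column]
    return dict(totals)


def sort_rows(rows, column):
    """Sort: order the result set by one column."""
    return sorted(rows, key=lambda row: row[column])


# Chain the steps the way transforms are chained in a dataflow.
large_orders = filter_rows(orders, lambda row: row["amount"] >= 100)
joined = join_rows(large_orders, customers, "customer_id")
totals = aggregate_sum(joined, "name", "amount")
by_amount = sort_rows(orders, "amount")
```

Each function takes a result set and hands a new one to the next step, which is the same source, transform, target idea a flowgraph expresses graphically.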
Data Provisioning contains the more advanced transforms:
Date Generation – generates a series of dates.
Row Generation – creates a result set based on a user-defined number of rows.
Case Node – used to route records based on value.
Pivot – transforms rows into columns.
Unpivot – transforms columns into rows.
Lookup – retrieves column value(s) from a lookup table that matches an expression.
Cleanse – used to parse, standardise, correct & enrich person, firm, address information.
Geocode – enrich address data with latitude / longitude information.
Table Comparison – compares 2 tables and produces the differences between them, flagged as insert, update or delete.
Map Operation – allows you to change the operation codes, for example changing an update to an insert.
History Preserving – allows you to produce a new row in the target table rather than update an existing row.
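The last three transforms in this list work with operation codes, which is easier to see with a small example. The sketch below is plain Python and only an approximation of the idea, not how SDI implements it; the function names, the sample rows and the string op codes are assumptions made for illustration.

```python
def table_comparison(source, target, key):
    """Flag each difference between source and target as insert, update or delete."""
    src = {row[key]: row for row in source}
    tgt = {row[key]: row for row in target}
    changes = []
    for k, row in src.items():
        if k not in tgt:
            changes.append(("insert", row))   # new in the source
        elif row != tgt[k]:
            changes.append(("update", row))   # same key, different values
    for k, row in tgt.items():
        if k not in src:
            changes.append(("delete", row))   # no longer in the source
    return changes


def map_operation(changes, mapping):
    """Rewrite operation codes; codes not in the mapping pass through unchanged."""
    return [(mapping.get(op, op), row) for op, row in changes]


# Hypothetical rows: the source is the incoming data, the target is what is already loaded.
source = [{"id": 1, "city": "Leeds"}, {"id": 2, "city": "York"}]
target = [{"id": 1, "city": "London"}, {"id": 3, "city": "Bath"}]

changes = table_comparison(source, target, "id")
remapped = map_operation(changes, {"update": "insert"})
```

Remapping an update to an insert means the changed row is added alongside the old one instead of overwriting it, which is loosely the effect History Preserving describes.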
To create a flowgraph we drag a combination of the required transforms on to the canvas and join them together. In the example below I’m joining 3 source tables from SAP ASE: ASE_Orders, ASE_Order_Details & ASE_Customers. The customer data is then passed through the Cleanse transform, where we parse and cleanse the name & address information before loading the result set into a template table in HANA.
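For readers who think better in code than in canvases, the same flow can be approximated in a few lines of plain Python. This is only an analogy and not what the flowgraph generates: the table names come from the example, but the columns, the sample rows and the toy cleanse logic are assumptions, and the real Cleanse transform does far more than fix whitespace and capitalisation.

```python
# Hypothetical rows for the three source tables; the column names are assumptions.
ase_orders = [{"order_id": 1, "customer_id": 10}]
ase_order_details = [{"order_id": 1, "product": "Widget", "quantity": 3}]
ase_customers = [{"customer_id": 10, "name": "  jOHN   smith ", "city": "leeds"}]


def inner_join(left, right, key):
    """Combine rows from two inputs that share the same key value."""
    return [{**l, **r} for l in left for r in right if l[key] == r[key]]


def cleanse(row):
    """Toy stand-in for Cleanse: tidy whitespace and capitalisation only."""
    cleaned = dict(row)
    cleaned["name"] = " ".join(row["name"].split()).title()
    cleaned["city"] = row["city"].strip().title()
    return cleaned


def run_flowgraph():
    """Join the three sources, cleanse each row, and return the result set."""
    orders_with_details = inner_join(ase_orders, ase_order_details, "order_id")
    joined = inner_join(orders_with_details, ase_customers, "customer_id")
    return [cleanse(row) for row in joined]


# The returned list plays the role of the template table created in HANA.
template_table = run_flowgraph()
```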
This is just a brief overview of the new capabilities SAP HANA EIM brings in SP09.