cancel
Showing results for 
Search instead for 
Did you mean: 

Performance issue when shipping data out of BW/HANA to Azure DataLake

former_member362452
Participant
0 Kudos
982

Hello Experts

Within our project we are shipping the data from SAP BW/HANA to Azure DataLake Gen 1 and witnessing severe performance issues with the overall runtimes for data transfer. Options we have tried so far include

Option 1 - Azure SSIS IR

SAP BW/HANA Hosted on Azure VM - North Europe > Calculation View (170 Columns - Attributes & Metrics, 5 Million Records) > SSIS Packages Hosted on Azure SSIS IR - West Europe > Azure DataLake Gen 1 - West Europe

Total Runtime - 88Mins

Option 2 - Azure Data Factory V2

SAP BW/HANA Hosted on Azure VM - North Europe > Calculation View (170 Columns - Attributes & Metrics, 5 Million Records) > Azure Data Factory V2 with Self Hosted IR - West Europe > Azure DataLake Gen 1 - West Europe

Total Runtime - 80Mins

BW/HANA System Details

SAP HANA - 1.00.122.23.1548298510 (fa/hana1sp12)

Platform - SUSE Linux Enterprise Server 12.4

ODBC Driver - HDB_CLIENT

HANA ODBC Version - 1.00.120.139.1589320942

We reached out to Microsoft support and they seem to suggest this is to do with ODBC driver! Has anyone come across similar situations and what was done to improve the performance?

PS: We tried to tweak the config./parameters at the SSIS Package level (Engine Threads, Batch Size, Buffer Size..)too but that had no noticeable impact on the performance

Thanks

Abhishek Shanbhogue

Accepted Solutions (1)

Accepted Solutions (1)

former_member362452
Participant

Posing this as an answer for the benefit of others who have similar situation. We have done various tests and could conclude the following

- Leaner the query better the performance, In our case we were trying to read 177 columns from HANA view and that was degrading the performance. If I query the same view for 60 columns then its performance is much better

- Splitting the columns into smaller chunks and then merging is another option but this will still have some performance implications

- Changing the packet size from default value could have significant performance benefit but this again works with lean query and not for a wide column set. We tried with PACKETSIZE = 2; in the connection string and this went well

Answers (0)