It started with exploring ways to count the number of records in SAP Business Suite system tables on an SAP HANA database. The record counts needed to be extracted once the SAP application was shut down. The requirement was to compare record counts for a datacenter migration, which was being performed using the backup/restore method.
If this were an R3load-based migration, the Table Checker tool, part of the Migration Checker, would be the perfect choice.
Migration Checker details:
- SAP Note 784118 - System Copy Tools for ABAP Systems
- SAP Note 2009651 - Table Checker Tool as Part of a System Copy with R3load
Given that SAP is stopped and there is no luxury of using the SAP migration tools, the available options are limited. Based on the details gathered, a few approaches were found, and the corresponding table record count extraction was performed.
I will briefly describe the approaches that were attempted and the challenge faced with each.
- Using the SAP HANA M_TABLES monitoring view (see SAP Help)
Challenge Faced: Though this is the fastest way to get the record count for all tables, the counts were not accurate on the target system after the restore. A query sketch follows this list.
- Table comparison using Software Update Manager (SUM): SAP Note 2670131
Challenge Faced: It worked fine for most tables but ran into issues with bigger ones. SAP HANA memory-related errors occurred while processing big tables like EDID4, BALDAT, CDPOS, etc.
- Counting individual table records using select count(*)
Challenge Faced: Though it fulfilled the requirement, the runtime was much higher; initially it was more than 2 hours.
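For reference, a minimal sketch of the first approach, run via HDBSQL. The schema name SAPABAP1 is an assumption and will differ per system; M_TABLES exposes a pre-computed RECORD_COUNT column per table:

# Sketch: read record counts from the M_TABLES monitoring view
# (fast, but in our case the counts were not reliable after the restore)
hdbsql -U DEFAULT "SELECT TABLE_NAME, RECORD_COUNT FROM M_TABLES WHERE SCHEMA_NAME = 'SAPABAP1' ORDER BY TABLE_NAME"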
Path to the Other way…
Given the issues faced with the first two approaches, the third approach was chosen, with the aim of reducing the overall runtime.
The runtime was reduced as follows:
- The master query was split into multiple sub-queries, each sub-query file holding a smaller chunk of count(*) statements.
- Each sub-query file was invoked in parallel using a PowerShell workflow, where each workflow iteration calls SAP HANA HDBSQL to get the record counts for its chunk of tables.
- Finally, all sub-query outputs were merged.
The Other way in Action ...
Microsoft PowerShell scripts were used to perform all activities.
- I used SAP HANA HDBSQL (Command-Line Reference) to fetch the data from the SAP HANA database.
hdbsql -U DEFAULT -I "sql_file_name.sql"
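The -U DEFAULT option refers to a key in the SAP HANA secure user store, so no password needs to appear in the scripts. A sketch of creating such a key with hdbuserstore; host, port, user, and password below are placeholders:

# Sketch: store connection details under the key DEFAULT (values are placeholders)
hdbuserstore SET DEFAULT "hanahost:30015" COUNTUSER "Secret_Password1"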
- Get the list of all tables and save it locally in a file; a sketch of this step follows.
- All table names are then available for which the record counts need to be extracted.
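A minimal sketch of this step, again assuming the schema SAPABAP1 and a hypothetical output file tablelist.txt (the -a and -x flags suppress the column header and the row-count summary):

# Sketch: save the list of table names to a local file
hdbsql -U DEFAULT -a -x "SELECT TABLE_NAME FROM TABLES WHERE SCHEMA_NAME = 'SAPABAP1' ORDER BY TABLE_NAME" | Out-File -FilePath "tablelist.txt"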
- Another SQL query file, the master query file, is then populated using the table names. This is used to fetch the individual table record counts.
- A few entries from the master query SQL file:
select count(*) as "CDHDR" from "CDHDR";
select count(*) as "CDPOS" from "CDPOS";
select count(*) as "CDPOS_STR" from "CDPOS_STR";
select count(*) as "CDPOS_UID" from "CDPOS_UID";
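A minimal PowerShell sketch of how the master query file could be generated from the saved table list; the file names tablelist.txt and master_query.sql are the hypothetical ones used above:

# Sketch: build one count(*) statement per table name
Get-Content "tablelist.txt" | ForEach-Object {
    $table = $_.Trim().Trim('"')
    if ($table) {
        "select count(*) as `"$table`" from `"$table`";" | Out-File -FilePath "master_query.sql" -Append
    }
}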
Improvement in runtime ...
- The generated master query SQL file was split into multiple sub-query files, as sketched below.
- The splitting logic was kept dynamic; based on input given at the start of script execution, the number of splits can be adjusted.
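A minimal sketch of the splitting logic; the value of $total_num_split and the file names are illustrative:

# Sketch: split master_query.sql into N sub-query files
$total_num_split = 8   # taken from input at the start of script execution
$queries = @(Get-Content "master_query.sql")
$chunkSize = [math]::Ceiling($queries.Count / $total_num_split)
for ($i = 1; $i -le $total_num_split; $i++) {
    $queries | Select-Object -Skip (($i - 1) * $chunkSize) -First $chunkSize | Out-File -FilePath "querytable_$i.sql"
}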
- Each sliced sub-query file then ran in parallel using a PowerShell workflow:
workflow Table_split_query {
    param ([int]$total_num_split)   # number of splits
    foreach -parallel ($i in 1..$total_num_split) {
        $path = Get-Location
        Start-Sleep -Seconds 5
        hdbsql -U DEFAULT -I "$path\querytable_$i.sql" | Out-File -FilePath "$path\recordCount_$i.txt"
    }
}
$total_num_split is the number of splits, passed in when the workflow is invoked. I kept the number of splits higher in production than in development.
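Once all workflow iterations complete, the per-split outputs can be merged into one consolidated file; a minimal sketch, using the recordCount_*.txt naming from the workflow above:

# Sketch: merge all per-split count outputs into a single file
Get-Content "recordCount_*.txt" | Out-File -FilePath "recordCount_all.txt"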
Conclusions
- The requirement was met successfully, and the record counts for all tables were extracted.
- By splitting the query SQL and running the resulting sub-queries in parallel, the runtime of the record count extraction was reduced drastically.
- The extraction, which earlier took more than 2 hours, now takes less than 10 minutes.
- The same extraction logic was adopted on both the source and target systems, which makes the comparison task even easier; a comparison sketch follows.
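For completeness, a minimal sketch of the final comparison, assuming the consolidated outputs from source and target were saved under the hypothetical names below; any line reported by Compare-Object points to a table whose count differs between the two systems:

# Sketch: diff the consolidated count files from source and target
Compare-Object (Get-Content "source_recordCount_all.txt") (Get-Content "target_recordCount_all.txt")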