on ‎2013 Jan 09 1:36 PM
Hi,
As mentioned in this other discussion http://scn.sap.com/thread/3252533 I'm testing SAP HANA on AWS.
Currently I'm experiencing very slow performance on UPDATES, so slow that I'm quite sure i have something wrong but I can't figure what is going on.
I'm currently working on a COLUMN table like this, the primary key is (PROGR asc, JOB asc).
The tests where performed filling the table with 1M, 5M, and 100M records. Here's a snapshot of the content of the table:
Through ODBC i connect a VB.NET application to my HANA developer instance on AWS.
The application spawns X threads, each thread creates his own connection/command/reader objects.
Each thread loops through the "JOBS" he's been assigned (eg, thread A works on records having JOB = 00000001 to JOB 00000050, thread B from 00000051 to 000000100 and so on. The number of "JOBS" The number of "JOBS" assigned to each thread is given by TOTAL_NUMBER_OF_RECORDS / NUMBER_OF_THREADS )
here's some pseudo code of the action deone by each thread.
FOR EACH JOB ID ASSIGNED
FOR K=101 TO 200
UPDATE THE RECORD WITH "JOB" = CURRENT_JOB_PROCESSED AND "PROGR" = K
SET NUM1 = 1, NUM2 = 2, NUM3 = 3, NUM4 = 4, NUM5 = 5, NUM6 = 6, NUM7 = 7, NUM8 = 8, NUM9 = 9, NUM10 = 10
NEXT
NEXT
The same thing is done on a MSSQL Server, which does the job faster.
I am sure that i have something not configured well because the gap beetween the times is huge.
I disabled savepoints and log on HANA just to take out disk issues on HANA's side, before running the update test i explicitly load the table in memory (just to be sure). I was reading some docs to understand the DELTA MERGING thing, if iI have to configure it somehow etc, i was thinking about ROW/TABLE locks and so on.
So, i have concurrent updates on the same table, each update works always on different records (the records to update are previously divided among the threads using the KEY of the table), the values updated are on colums that are not indexed nor in priamary key).
HANA is very fast on same testing but for SELECT queries, but on UPDATE something is going on.
Any ideas?
Thank you all,
david.
EDIT: the instances on AWS on which i'm using HANA and MSSQL are the same m2.4xlarge
Message was edited by: David Mandujano
Request clarification before answering.
Hey David,
not sure how MS SQL handles this, but why don't you run the updates as batch updates?
This would at least eliminate network round trips for every single update.
Another thing is: do you use parameterized/prepared statements? You know, the ones with the ? instead of the actual values?
If not, you definitively should do so. HANA doesn't automatically convert your literal-statements (Oracle can do that, MS SQL: no idea).
Running the same SQL statements just with different set or condition values over and over again is really not the best choice performance wise.
Switching of the savepoint is not going to help you here - it's a completely asynchronous process anyhow. No SQL statement would actually have to wait for long until the savepoint is done (actually, there's just a super short time frame - the critical phase - where change operations are halted during the savepoint. But that has (not yet at least) been an issue for any HANA user).
Same thing with delta merging - just not relevant for the update command.
- Lars
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.
Hello David,
no SAP supported DBMS "hits the disk" for inserts.
All actual data manipulation is done in memory - otherwise all those systems would be terribly slow.
Explicit loading really is not necessary - the insert simply can't operate on data not in memory.
So, without detailed run time analysis I really can't say where or why time gets lost.
- Lars
I think the issue is on parallelisation:
While my App is importing and selecting I can see, using HANA Studio, the threads performing the IMPORT and the SELECT, the number of "HANA threads" is equal to the number of "App threads" of my application.
While the app is performing UPDATE I can see only one or two "HANA threads" while there are from 8 to 20 "App threads" sending update commands.
Is there a way to tell HANA to trust that different "App threads" will never ask to UPDATE the same record?
I am divididing the records to update among the "App threads" using the PRIMARY KEY of the table so if HANA spawns a "HANA thread" for each "App thread" during update no concurrent access to the same record will ever happen.
Reading the documentation I can see that there is no "READ UNCOMMITTED" isolation level.
The reason why you don't see READ UNCOMMITTED is: there is no such isolation level.
Just as in Oracle, it's simply not there.
Due to the multi version concurrency there is no need to have this for any performance reason.
And no, you can't tell HANA that only some rows will be touched by a specific session...
From what I understand of your approach, you maybe get some benefits by partitioning the target table by the column you spread your application threads over.
Finally, about "seeing" the HANA threads being active is easily misleading.
In the standard threads display you only see those threads currently doing stuff in the HANA kernel.
If the statement communication is currently "on the way" to the client, you wouldn't see it in the active threads list.
- Lars
In one application that I made, we did a lot of insert into the database. At first, the performance was really slow. The network usage between the server and the application was minimal.
We accelerated the inserts by using parameterized insert. This gave us some performance improvement, but it was still slow.
As each connection is slow, we created many connection (40 connection) and inserted all the data using all those connection at the same time. This increased performance a lot more and we consider it fast enough now.
| User | Count |
|---|---|
| 7 | |
| 6 | |
| 5 | |
| 4 | |
| 4 | |
| 4 | |
| 3 | |
| 3 | |
| 3 | |
| 2 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.