a month ago - last edited a month ago
We faced a problem with low throughput in SAP HANA, measured in statements per second executed by sql_executor threads.
Metrics: The current productive system runs on Oracle 19.24, where we can achieve throughput of 800,000 user calls per second. We consider Oracle user call per sec and HANA statements per sec to be equivalent. Additionally, throughput can be monitored by the number of network packets per second at the OS level (for example, using nmon).
Test description: The test is multi-threaded. From a client host running a Python script, we execute queries like SELECT ID FROM PYTHON_MON.ZBC_TEST_N_<from 0001 to 00230> to one of the HANA nodes. Each query always returns one row; each table contains one row. The number of tables used in the test varies from 120 to 1500. The issue becomes noticeable already at 150 active threads. The HANA node is a P10 server with 16 NUMA nodes, 32TB RAM, SMT8 (we also tested SMT2 and SMT4). In this configuration, SAP HANA can reach about 100,000 statements/sec. Tests were also done with real data in ABAP, selecting from tables like MARA, MARC – the throughput was also unsatisfactory.
Parameters that positively affect throughput:
composite_statistics_striping=numa
resource_tracking=false
statistics_type=7
Setting these parameters on an empty database without much data loaded can improve performance by 2–2.5 times.
Hardware throughput test:
We tried creating up to 16 index servers on the HANA node,
binding CPU cores to a NUMA node.
Example: ALTER SYSTEM ALTER CONFIGURATION ('daemon.ini','SYSTEM') SET ('indexserver.H49', 'instance_affinity[3]')='480-719';
All test tables were located on their respective index servers, and the load from clients was sent to their assigned index servers.
In this configuration, a single HANA node on an IBM Power P10 server delivered a total throughput of 4,200,000 stm/sec across all index servers.
Query execution speed in a multi-threaded test:
Average selection speed was checked with the script "HANA_Connections_Statistics_2.00.060+.txt"
fields AVG_SEL_MS+AVG_PREP_MS.
In Oracle, we used a script based on SQL cache or AWR:
select Round((elapsed_time/DECODE(executions ,0,1,executions))/1000000,4) avg_elap_sec,s.* from v$sql s where s.SQL_ID = &sqlid.
When SAP HANA’s overall throughput degrades, the speed of individual queries does not decrease. The load on SAP HANA does not scale. No matter how many client connections are opened, only about 100–120 active threads on the server are doing useful work.
What we tried from SAP Notes:
We checked all available notes related to the issue.
For configuration we used:
"HANA_Configuration_MiniChecks_2.00.080+.txt"
"HANA_Threads_Callstacks_MiniChecks_2.00.060+.txt"
"HANA_Configuration_Parameters_2.00.040+.txt" and others.
We analyzed RTE dumps and perf traces at the OS level.
More than 500 SAP HANA parameters recommended in SAP Notes were tested.
SAP HANA releases:
The issue exists on all versions we tested.
Current version – 2.00.081.00.1733303410 SUSE Linux Enterprise Server 15 SP6 6.4.0-150600.23.25-default
We tested on 79, 78, 37 with different SPs.
Servers: IBM Power P9 and P10.
Operating systems: SLES, RHEL.
Request clarification before answering.
User | Count |
---|---|
73 | |
21 | |
9 | |
8 | |
6 | |
6 | |
5 | |
5 | |
4 | |
4 |
You must be a registered user to add a comment. If you've already registered, sign in. Otherwise, register and sign in.