A system is only as strong as its weakest link – a well-known fact.
In the integration domain we connect applications and data into an integrated system capable of seamlessly performing business processes across different connected applications, without interruption and without human-in-the-loop action (popular wording in the age of AI). Of course, this is not an official definition – and there could be many of those, more or less formal.
The important thing is to understand that an integrated system will connect many applications and various data flows, usually using various middleware IT components.
We do run Unit Testing, Functional Testing or System Integration Testing on our applications or our connected applications – but is this enough?
For a new application, it is just not enough to test only whether it fulfils functional requirements – in the same way, for an integrated system with several applications and one or more middleware IT components, it is not enough to test only whether the integration works. We need to understand whether our application or integrated system can meet specific non-functional requirements. Can it perform? And what are the limits it can sustain?
Yes, I am talking about Performance Testing, Load Testing, Stress Testing and more…
We do those things with applications, but are we following the same route with integrations and integrated systems?
We should…
But let me first go through some general intro – what are the different types of Performance Testing and what are the appropriate testing methodologies to apply – regardless of whether we talk about testing individual applications or integrated systems with middleware flows.
I am not going to go into definitions of what Unit Testing, Functional Testing or System Integration Testing are; let me focus only on the family of Performance Testing.
While there are many ways to split Performance Testing into several distinct types[1][2][3][4], I will stick to my usual habits and stay with the traditional one from IBM[5].
On top of these testing types, it is worth mentioning that, as part of Stress Testing, we also perform Reliability Testing, verifying how the system recovers from a “break” situation – i.e. if a specific service goes down, we do not want to lose any messages in between.
Testing methodology depends on the overall development approach, but the most common approaches are:
However, no matter which development approach we practice, the key testing goals always stay the same:
Now, I have deliberately avoided saying that only functional requirements are business requirements. In fact, non-functional requirements for performance can very often be very much business relevant, as business can set some clear business-driven SLAs.
| SLA examples | KPI examples |
|---|---|
| 4s average response time | Last month we achieved 4.87s average response time |
| 99% of orders must be received and processed within 8s | Last quarter we had 97.4% of orders processed within 8s |
| 1 000 000 orders per day without degradation of service | Yesterday we processed 1 002 158 orders, with all SLAs kept |
| … | … |
What does this tell us?
By clearly understanding the SLAs, we can define appropriate Performance Testing measurements (and scripts) – what we want to test and what level of service we need to achieve with our new application or integrated system.
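To make SLAs testable, each one can be turned into a concrete check over the measured response times. Below is a minimal sketch in Python, using the sample SLA figures from the table above (4s average, 99% within 8s); the response-time sample is invented for illustration.

```python
# Sketch: evaluate measured response times against the sample SLAs above.
# Thresholds come from the SLA table; the sample data is made up.

def evaluate_sla(response_times_s, avg_target_s=4.0,
                 pct_target_s=8.0, pct_level=0.99):
    """Return pass/fail results for the two sample SLAs."""
    n = len(response_times_s)
    avg = sum(response_times_s) / n
    share_within = sum(1 for t in response_times_s if t <= pct_target_s) / n
    return {
        "average_s": round(avg, 2),
        "average_ok": avg <= avg_target_s,
        "share_within_target": round(share_within, 4),
        "percentile_ok": share_within >= pct_level,
    }

sample = [2.1, 3.4, 4.8, 3.9, 7.2, 2.6, 3.1, 9.5, 3.0, 2.8]
print(evaluate_sla(sample))
```

With this tiny sample both checks fail (average 4.24s, only 90% within 8s), which is exactly the kind of verdict a real test run should produce automatically.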
Please note, the focus here is on Performance Testing, so I am not addressing other non-functional requirements, although we may apply a similar approach for them as well (but the actual testing or verification may be significantly different).
For Performance Testing in integration (or in general), the first step is, of course, gathering all non-functional requirements, like SLAs indicating e.g. response time, error rate etc. This is the moment to collect them all.
Let’s have one thing clear – when testing integration, we are testing inbound and outbound, complex multi-system flows and the respective endpoints of the Provider and Consumer(s). But we are not testing the actual business process within the Provider and Consumer(s) – this should be covered by relevant application testing.
Let us plan what types of Performance Testing we need to execute (Load Testing, Scalability Testing, Spike Testing, Volume Testing, Endurance (or Soak) Testing and/or Stress Testing) and create realistic workload models.
In realistic terms, for integrations, we may stick with only a few types of testing, combining the necessary testing requirements:
There is a difference between testing a Sync API or flow and an Async API or flow[7][8].
Figure 1. Sync vs. Async
All clear, but how does this impact our Test Design for Performance Testing? Let's dig deeper...
Figure 2. Example of Sync flow with SAP Integration Suite (CPI and API-M)
Sync Integration Execution is single-threaded – only one operation will run at a time.
Sync means the Sync Request-Reply pattern. As long as the Sender is waiting for a response from the Receiver (either directly or indirectly) to finalize a specific operation, this is considered Sync processing. We may even have some queueing with retry logic in between (i.e. within an SAP Integration Suite CPI flow), but if the Sender is waiting for the final response, this is still Sync processing.
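Measuring a Sync flow is therefore straightforward: the response time is simply the wall-clock time the Sender spends blocked on the call. A minimal sketch, assuming a placeholder endpoint URL (this is essentially what any load tool does, many times in parallel):

```python
# Sketch: in a Sync Request-Reply flow, end-to-end latency is the wall-clock
# time around the blocking call. The endpoint URL is a placeholder.
import time
import urllib.request

def timed_sync_call(url, payload: bytes, timeout_s=30):
    """Send one sync request and return (status_code, elapsed_seconds)."""
    req = urllib.request.Request(url, data=payload,
                                 headers={"Content-Type": "application/json"})
    start = time.perf_counter()
    with urllib.request.urlopen(req, timeout=timeout_s) as resp:
        resp.read()  # waiting for the full response body *is* the sync wait
        status = resp.status
    elapsed = time.perf_counter() - start
    return status, elapsed
```

Collecting many such `(status, elapsed)` pairs gives us the raw data for averages and percentiles later on.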
While we may collect additional logging from the Receiver and the middleware IT component(s), this would be more relevant from the perspective of monitoring and observability, not Performance Testing itself.
Figure 3. Example of Async flow with SAP Advanced Event Mesh
Async Integration Execution is multi-threaded – multiple operations can run in parallel.
Async means decoupled, and it may be the PubSub pattern or the Async Request-Reply pattern. Here the situation is a bit more complex, as we need to collect and compare relevant logs.
In the PubSub pattern, individual IT components may or may not be set to send appropriate responses or acknowledgements (http, ACK/NACK, QoS), but those responses or acknowledgements are not (by default) propagated from the Receiver(s) back to the Sender – if set, a response or acknowledgement is only information that the next component in the flow has received the message.
With the Async Request-Reply pattern, the Receiver will send back a separate response message for the received messages. However, this message is also sent as an Async API, usually after some processing has been done in the Receiver system. Implementation of the Async Request-Reply pattern is a separate topic not covered in this article – but in general it can be a completely separate Async flow, or it can be built using correlation IDs (i.e. using SAP Advanced Event Mesh and CPI[9], or Solace PubSub+ SolClient Asynchronous Callbacks[10]).
In both Async patterns:
Again, we may collect additional logging from the middleware IT component(s), but this would be more relevant from the perspective of monitoring and observability, not Performance Testing itself.
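For Async flows, per-message latency has to be reconstructed afterwards by joining the Sender's send log with the Receiver's processing log on a correlation ID. A minimal sketch with invented log structures – note that IDs missing on the Receiver side are exactly the lost messages Reliability Testing cares about:

```python
# Sketch: reconstruct Async end-to-end timing by joining Sender and Receiver
# logs on a correlation ID. Both log structures here are invented.

def async_latencies(sent_log, received_log):
    """sent_log / received_log: dicts of correlation_id -> epoch seconds.
    Returns per-message latency for delivered messages, plus lost IDs."""
    latencies = {cid: received_log[cid] - t_sent
                 for cid, t_sent in sent_log.items() if cid in received_log}
    lost = sorted(set(sent_log) - set(received_log))
    return latencies, lost

sent = {"m1": 100.0, "m2": 100.5, "m3": 101.0}
recv = {"m1": 102.2, "m3": 103.5}
latencies, lost = async_latencies(sent, recv)
print(latencies)  # latency per delivered correlation ID
print(lost)       # "m2" never arrived - a reliability finding
```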
Do we test only inbound flow and Receiver endpoint, or do we need to emulate specific processes and actions in the Sender application as well?
Figure 4. What is the scope of testing, what do we script?
Ordinarily we measure performance for the inbound flow and the endpoint of the Receiver. If the Receiver application is also a Provider, this also gives us the Baseline Performance of the specific Integration Service.
However, if we are introducing a new Sender application, SLAs may require that we perform Performance Testing on the full process, starting from the Sender application.
Let’s also understand that, depending on the specific integration flow:
The question is, in case of multiple Consumers as Receivers, do we test them all at once?
Figure 5. Test each Receiver separately
The recommended approach is to test each Receiver separately. This gives us a clear picture of the boundaries for each individual Receiver system.
But can we still have multiple Providers?
Yes and no… In fact, it is possible to have different technical backend systems providing a specific Integration Service – i.e. for order taking, we can have two or more SAP S/4HANA backend systems, each servicing different countries or regions, where routing is done in CPI or API-M; but if this is the same Integration Service provided by the same application (even though there are two or more technical systems behind it), we will consider SAP S/4HANA as one Provider for the order taking Integration Service.
We need sample payloads, but we also need to ensure all necessary Master Data and Organizational Data exist and are appropriately configured – i.e. if we are creating orders, we need to have an existing SoldToParty, Product, OrderType, SalesOrganization, PricingCondition (or Promo) etc.
But it’s not only the payload itself – we also need to understand whether we need to set specific attributes with the API call (i.e. within the http header). This also needs to be defined upfront.
In some cases, certain attributes (i.e. in the http body) trigger specific processing in the Receiver application – for order taking, the OrderType can invoke different standard/custom function modules/processing in SAP S/4HANA. All of these need to be defined upfront.
Most test environments are not sized like productive environments, and for e.g. Functional Testing this is perfectly fine, but for Performance Testing it may give a considerably wrong picture. The general recommendation is to run Performance Testing in a test environment (or QA environment) that closely mirrors production, including all IT components and software versions.
How do we do this?
This all depends on the applications and IT components we need to configure. In some cases it might be rather easy, while in others it might be more challenging. For SAP Integration Suite (CPI or API-M), a very common scenario is to have separate tenants but with the same or comparable configuration. For Azure Integration Services (i.e. Service Bus, Functions), it is fairly easy to temporarily change the licensing model of the test environment/subscription and assign it the same power as the productive environment/subscription. The same goes for most SaaS applications in general – it’s all about temporarily configuring the subscription, and if we keep the time window for Performance Testing rather narrow, this will not significantly increase the subscription costs.
But in some cases, this may not be so simple. In case of SAP Advanced Event Mesh, it all depends on the deployment strategy of the broker:
While for most IPaaS or SaaS applications and IT components there is a way to (at least) temporarily configure the test environment to match the productive one, in some cases it might simply not be feasible – especially for on-prem system deployments.
What do we do?
There is no golden rule – but there are some workaround steps we can take.
| # | Step | Example |
|---|---|---|
| 1. | Measure system performance | Measure the performance of similar services in the test and productive environments – e.g. for an SAP environment, use the Workload Monitor (ST03/ST03N) to measure the response time distribution for various task types (like dialog, background). |
| 2. | Measure program runtime performance (optional) | Optionally, run a detailed analysis of specific programs – e.g. for an SAP environment, use Runtime Analysis (SE30/SAT) for ABAP programs to measure the execution time of individual statements, function modules, and database calls. |
| 3. | Measure database performance (optional) | Optionally, trace specific performance-related SQL activities – e.g. in an SAP environment, use Performance Analysis (ST05) to measure where time is spent on which activities. |
| 4. | Calculate productive vs. test environment processing power | Use all measurements to calculate the realistic processing power of your test and production environments – e.g. Workload Monitor, Runtime Analysis and Performance Analysis will give somewhat different values showing that the productive system is faster. Example: |
| 5. | Extrapolate and adjust results from the test environment | Extrapolate all Performance Testing results obtained on the test environment with the calculated factors. Example: |
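The calculation in steps 4 and 5 can be sketched as follows. The reference measurements (480 ms on test, 320 ms on production) are invented for illustration, and the extrapolation assumes performance scales roughly linearly with the measured speed factor, which is a simplification:

```python
# Sketch: extrapolate results measured on a weaker test environment to
# production, using a speed factor from a comparable reference workload
# (e.g. ST03N dialog response times). All numbers are invented.

def speed_factor(test_env_avg_ms, prod_env_avg_ms):
    """How many times faster production is for the same reference workload."""
    return test_env_avg_ms / prod_env_avg_ms

def extrapolate(test_result_s, factor):
    """Adjust a response time measured on the test environment."""
    return test_result_s / factor

# Reference workload: dialog steps average 480 ms on test, 320 ms on production.
f = speed_factor(480, 320)       # -> 1.5, production is 1.5x faster
projected = extrapolate(5.4, f)  # 5.4 s on test -> 3.6 s expected on production
print(f, round(projected, 2))
```

The linearity assumption is exactly why steps 2 and 3 (program and database measurements) are worth doing: if they disagree strongly with the system-level factor, a single scaling number is too crude.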
Now we need to start using specific testing tools like JMeter, Azure Load Testing (also using JMeter scripts) or LoadRunner. The goal is to create scripts that simulate the specific actions and interactions we want to test.
Let’s go through the inputs we have collected:
| # | Input | Example |
|---|---|---|
| 1. | SLAs | For the order taking API: |
| 2. | Volumes | Annual average of 160 000 orders per working day; |
| 3. | Business patterns | 80% of orders are created during extended working hours from 10:00-22:00, out of which half are created in the evening 19:00-21:00 |
| 4. | Systems under test | S/4HANA API_SALES_ORDER_SRV Sales Order (A2X), single cluster, no policy routing; |
| 5. | Users | No business users; testing integration only |
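From the volumes and business patterns above, we can already derive the peak arrival rate the script has to sustain – a small worked example:

```python
# Sketch: derive a peak arrival rate from the collected inputs.
# 160 000 orders/day, 80% during 10:00-22:00, half of those in 19:00-21:00.

daily_orders = 160_000
evening_orders = daily_orders * 0.80 * 0.50   # 40% of all orders in 2 hours
evening_window_s = 2 * 3600

peak_rate = evening_orders / evening_window_s  # orders per second at peak
print(round(peak_rate, 2))  # ~8.89 orders/s during the evening peak
```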
Based on these inputs we will build an appropriate Load Testing script:
| # | Script | Example |
|---|---|---|
| 1. | Target | Response time percentile 50 should be below 4s; |
| 2. | Capturing Test Results | Capture information about sent messages on the Sender side (i.e. in the testing tool): number of messages sent, start time (sending) and stop time (sending); in case of an Async API, also capture the overall status from the Receiver system logs on successfully received/processed messages: number of messages received, overall timing from start to end, and the response status if response/acknowledgement is enabled; |
| 3. | Number of Threads | We count for maximum daily volume + 5 years growth + 50% margin; as we have already included a safety margin, we are okay to set: |
| 4. | Ramp-up period | We could use 4s as this is the desired average response time, but we will use the default 1s for all tests; |
| 5. | Loop Count | For Load Testing there is no need to loop payloads more than 20-50 times; |
| 6. | Payloads | Create payloads and distribute the number of items across them; in total we may have 20-50 payloads or more; |
| 7. | Endpoint | API-M SalesOrder endpoint |
| 8. | Users | No business users; |
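The Number of Threads calculation (row 3) can be sketched as follows. The peak rate comes from the volumes and business patterns collected earlier; the 10% annual growth rate is purely an assumed figure, since the inputs state only "maximum daily volume + 5 years growth + 50% margin"; the thread count then follows from Little's law (concurrency ≈ arrival rate × time in system):

```python
# Sketch: size the Number of Threads from the target throughput.
# ASSUMPTION: 10%/year growth is an invented figure for illustration.

peak_rate_per_s = 8.89   # peak arrival rate from the workload inputs
annual_growth = 0.10     # assumed growth rate (not from the original inputs)
years = 5
margin = 1.50            # 50% safety margin
avg_response_s = 4.0     # target average response time (SLA)

future_rate = peak_rate_per_s * (1 + annual_growth) ** years * margin
# Little's law: concurrent requests ~ arrival rate x time each spends in system
threads = future_rate * avg_response_s
print(round(future_rate, 1), int(round(threads)))
```

With these assumed numbers the script would need roughly 86 concurrent threads to represent the margined 5-year peak; with real growth figures the arithmetic is the same.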
What does this look like in practice, and what does it mean?
Figure 6. JMeter configuration example
In this example, I use JMeter[12] as a testing tool of choice:
So, for combined Scalability Testing, we may adjust the script and just increase Number of Threads (to simulate spikes), increase percentage of payloads with extreme number of items (to simulate data volumes), and increase Loop Count (to simulate soak).
However, for combined Stress Testing we should gradually increase the Number of Threads, while keeping all other parameters steady (as in Load Testing) – to see when it breaks (errors, and what kind of errors). The second test would be to gradually increase the number of items in the payload, while keeping all other parameters steady (as in Load Testing) – to see when it breaks. Further investigation of errors and system behavior is needed to verify integration reliability, but this will depend very much on the specific integration flow – i.e. Async flows should normally be decoupled with queues and built-in retry resilience, while a Sync flow normally receives an error response and the application/user decides the next action.
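The gradual increase for combined Stress Testing can be sketched as a simple stepwise loop around the load tool; `run_load` below is a placeholder for launching a parameterized test plan and returning the observed error rate:

```python
# Sketch: stepwise Stress Testing ramp - increase thread count in steps while
# keeping all other parameters steady, and stop at the first load level where
# the error rate crosses a threshold. `run_load` is a placeholder callback.

def find_breaking_point(run_load, start_threads=50, step=50,
                        max_threads=1000, max_error_rate=0.01):
    """run_load(threads) -> observed error rate (0.0-1.0)."""
    threads = start_threads
    while threads <= max_threads:
        error_rate = run_load(threads)
        if error_rate > max_error_rate:
            return threads, error_rate  # first load level that "breaks"
        threads += step
    return None, 0.0                    # no break within the tested range

# Illustration with a fake system that starts failing above 300 threads:
fake_system = lambda t: 0.0 if t <= 300 else 0.05
print(find_breaking_point(fake_system))  # -> (350, 0.05)
```

The same loop applies to the second test, with the payload item count as the stepped parameter instead of the thread count.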
We have designed and created scripts for Load Testing, combined Scalability Testing and combined Stress Testing.
As we are simulating real-life scenarios, tests should be performed under realistic conditions:
| # | Condition | Example |
|---|---|---|
| 1. | Applications | No other users should execute the same integration flow which is under test; |
| 2. | IT components | No other users should execute the same integration flow which is under test; |
| 3. | Execution timetable | Tests should respect the business pattern of operations. We have three distinct business patterns, and we should run all tests during each business pattern: |
After execution of all tests, we have to conduct an appropriate evaluation of the results:
| # | Evaluation | Example |
|---|---|---|
| 1. | Load Testing | As per the SLAs, evaluate the actual percentiles 50 and 99 for all Test Executions (we have at least 3 runs, one for each business pattern); |
| 2. | Scalability Testing | Analyze percentiles 50 and 99 for all Test Executions (there could be many runs); |
| 3. | Stress Testing | Monitor the Test Executions (there could be many runs); the observations will be used to define the boundaries of the integration flow, i.e.: |
For percentiles, we can use aggregated Test Results report.
Figure 7. Percentiles example
This example graph shows that the median response time (percentile 50) is around 3s, while percentile 99 is around 4.7s.
JMeter provides a number of possibilities to calculate percentiles from the aggregate reports, or we may simply go for some of the add-on graph reports and include them in the Test Plan[13].
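For reference, percentiles 50 and 99 can also be computed directly from raw response times, e.g. to cross-check the JMeter report – a minimal sketch using only the Python standard library:

```python
# Sketch: compute percentile 50 and 99 from raw response times, comparable
# to the percentile columns of JMeter's aggregate report.
import statistics

def percentiles(response_times_s):
    """Return (p50, p99) using inclusive quantiles over the sample."""
    p50 = statistics.median(response_times_s)
    cuts = statistics.quantiles(response_times_s, n=100, method="inclusive")
    p99 = cuts[98]  # the 99th of the 99 cut points is the 99th percentile
    return p50, p99

sample = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
print(percentiles(sample))  # -> (5.5, 9.91)
```

Note that on small samples the 99th percentile is dominated by the single slowest observations, which is why Stress and Load test runs need enough iterations for the tail to be meaningful.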
Remediations?
Of course, most likely your first round of Performance Testing will not provide fully satisfactory results. The next steps mostly consist of identifying what optimization potential there is, working on it, and then re-running the tests. The good thing is – all the scripts are already there, so there is no need to redo everything from scratch.
However, if (for whatever reason) SLAs are re-negotiated and changed, the scripts will also have to be adjusted.
Why am I writing this article?
While most Project Management and Test Management routines are highly regulated, project teams might (often?) face a lack of specific guidelines on how to test integrations, especially their performance – as integrations are, let's be honest, a rather specific area…
Well, this is at least my view…
In this article, I have used examples with SAP S/4HANA, SAP Integration Suite (CPI and API-M), SAP Advanced Event Mesh and JMeter – but the principles are basically the same, no matter whether we use SAP or non-SAP applications and IT components.
Anyway, as already indicated – there is no golden rule – this is just one possible approach to organize our Performance Testing for integration. Of course, this is not a rule book, and things should be adjusted to the specific needs. As always, this is just a potential guideline – nothing is carved in stone…
*) Intro photo by Adi Goldstein on Unsplash
**) This article uses SAP Business Technology Platform Solution Diagrams & Icons as per SAP Terms of Use governing the use of these SAP Materials (please note, newer version of the Solution Diagrams & Icons, as well as Terms of Use, might be in place after the publication of this article).
More guidelines on Solution Diagrams & Icons can be found in this article by Bertram Ganz.
[1] Queue IT: https://queue-it.com/blog/types-of-performance-testing/
[2] Microsoft Learn: https://microsoft.github.io/code-with-engineering-playbook/automated-testing/performance-testing/
[3] Microsoft Learn: https://learn.microsoft.com/en-us/azure/well-architected/performance-efficiency/performance-test
[4] JMeter: https://www.f22labs.com/blogs/mastering-performance-testing-with-jmeter-a-comprehensive-guide/
[5] IBM: https://www.ibm.com/think/topics/performance-testing
[6] Spiral model: https://en.wikipedia.org/wiki/Spiral_model
[7] How to build an Integration Architecture for the Intelligent Enterprise: Part 1
[8] How to build an Integration Architecture for the Intelligent Enterprise: Part 2
[9] SAP AEM Async Request-Reply: https://community.sap.com/t5/technology-blog-posts-by-members/implement-request-reply-integration-pa...
[10] Solace PubSub+ Async Request-Reply: https://tutorials.solace.dev/c/request-reply/
[11] SAP Event Add-on: https://community.sap.com/t5/technology-blog-posts-by-sap/cheaper-than-you-think-the-commercial-mode...
[12] Apache JMeter: https://jmeter.apache.org/
[13] Apache JMeter Test Plan: https://jmeter.apache.org/usermanual/build-test-plan.html