Decouple Sender and Flows for Asynchronous Transfers - Scaling Issues

nils_bente
Discoverer

Dear Cloud Integration Experts,

While I am aware of the concepts for decoupling sender and flows using JMS queues or Data Stores described in the link below, I am struggling to apply them at larger scale.

https://help.sap.com/docs/integration-suite/sap-integration-suite/c5591df1388b4cf08aa3ff9527806b70.h...

Here is my scenario: We are sending a lot of asynchronous ABAP proxy messages from our SAP ERP system to Integration Suite, from where we send the messages to various receiver systems. The messages are received in Integration Suite by an XI sender adapter with the temporary storage option “JMS Queue”.

Now suppose there are messages in the JMS queue for two different receivers (receiver A and receiver B) and two different interfaces for each receiver (interfaces A1 + A2, and B1 + B2 respectively).

Initial approach: Direct processing from XI sender queue

Messages are consumed from the JMS queue of the XI sender and are processed and sent to the respective receiver interfaces without further decoupling.

The problem with this approach is that there is no way to prioritize which messages should be consumed from the JMS queue. If a lot of messages for interface A1 are put into the queue at once, and it takes the receiver system a long time to return a response for each A1 transfer, all the messages for interface A2 and for receiver B that have been added to the queue in the meantime are stuck until it is their turn to be consumed. With this approach, one interface is basically able to block the queue for all other interfaces.
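
To make the blocking mechanics concrete, here is a rough sketch in plain Java/JMS of what a single shared consumer effectively does (the queue name and the "interface" header are made up for illustration; the real XI adapter internals are of course different):

```java
import javax.jms.*;

// Minimal sketch (plain JMS 2.0 API, not the real XI adapter): a single
// FIFO consumer on the shared queue. One slow receiver call for A1 stalls
// every A2/B1/B2 message queued behind it -- head-of-line blocking.
public class SharedQueueConsumer {

    static void consume(ConnectionFactory factory) throws JMSException {
        try (JMSContext ctx = factory.createContext()) {
            JMSConsumer consumer = ctx.createConsumer(ctx.createQueue("XI_SENDER_QUEUE"));
            while (true) {
                Message msg = consumer.receive();                   // strictly in arrival order
                String iface = msg.getStringProperty("interface");  // e.g. "A1" (assumed header)
                deliverToReceiver(iface, msg);                      // slow call blocks the whole loop
            }
        }
    }

    static void deliverToReceiver(String iface, Message msg) {
        // HTTP/RFC call to the receiver system; may take a long time per message.
    }
}
```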

Option 1: Add JMS queues for each receiver system

Instead of sending messages to the receiver system directly after consuming them from the XI sender queue, we create one additional JMS queue for each receiver system and send the messages to these queues for decoupling. In my example, we would create JMS queues “A” and “B”.
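
In CPI this fan-out would simply be a Router step; expressed as a plain-JMS sketch (again with made-up queue names and a hypothetical "receiver" header), the dispatcher looks roughly like this:

```java
import javax.jms.*;

// Sketch of the Option 1 dispatcher: drain the XI sender queue quickly and
// fan messages out to one JMS queue per receiver system ("A", "B", ...).
public class ReceiverDispatcher {

    static void dispatch(ConnectionFactory factory) throws JMSException {
        try (JMSContext ctx = factory.createContext()) {
            JMSConsumer in = ctx.createConsumer(ctx.createQueue("XI_SENDER_QUEUE"));
            JMSProducer out = ctx.createProducer();
            while (true) {
                Message msg = in.receive();
                String receiver = msg.getStringProperty("receiver"); // e.g. "A" (assumed header)
                // Forwarding is fast, so the XI queue is no longer blocked --
                // but all interfaces of one receiver still share queue "A".
                out.send(ctx.createQueue(receiver), msg);
            }
        }
    }
}
```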

Now we put a lot of A1 messages into the XI sender queue again. These will be forwarded to the “A” queue quickly and no longer block the XI sender queue for messages to receiver B. However, while messages for interface A2 will also be forwarded quickly from the XI sender queue, they will get stuck in the “A” queue behind all the A1 messages.

So if we rely on having near-real-time transfers for specific interfaces, this approach will not solve all our issues.

Option 2: Add JMS queues for each interface

Instead of creating JMS queues for each receiver system, we create one JMS queue for each interface. In my example, we would create four JMS queues “A1”, “A2”, “B1”, “B2”.

While this approach will solve the issues described above, it unfortunately does not scale at all: the number of JMS queues is limited, and we will quickly run out of queues when adding additional interfaces.

Option 3: Add Data Stores for each interface

Instead of forwarding the messages from the XI sender queue to other JMS queues, we forward them to Data Stores. There is no hard limit on the number of Data Stores, so this looks like an approach that can potentially scale very well.

But using Data Stores has one major drawback from my point of view. The Data Store sender adapter does not allow parallel processing. While a JMS sender can be configured to consume multiple messages in parallel, the Data Store sender will only pick up the next message after the previous message has been processed.

With that approach it is not possible to send multiple messages for the same interface in parallel to the receiver system. But if the receiver system supports that, we absolutely want to do that when transferring lots of messages at once to significantly shorten the total runtime in these scenarios.
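
To illustrate what we lose, this is roughly what parallel consumption means at the JMS API level - one session/consumer pair per worker thread (a generic sketch, not the adapter's actual implementation):

```java
import javax.jms.*;

// Sketch: "parallel consumption" at the JMS API level. Each JMSContext
// (session) is single-threaded, so one context/consumer pair is created per
// worker thread -- N messages for interface A1 can be in flight at once.
public class ParallelJmsConsumer {

    static void start(ConnectionFactory factory, String queue, int workers) {
        for (int i = 0; i < workers; i++) {
            Thread worker = new Thread(() -> {
                try (JMSContext ctx = factory.createContext()) {
                    JMSConsumer consumer = ctx.createConsumer(ctx.createQueue(queue));
                    while (true) {
                        Message msg = consumer.receive();
                        deliverToReceiver(msg); // a slow call only blocks this one worker
                    }
                }
            });
            worker.start();
        }
    }

    static void deliverToReceiver(Message msg) { /* call the receiver system */ }
}
```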

In theory we could forego the Data Store sender adapter, and instead select multiple messages from the Data Store manually and split them. But this looks like an awkward approach. The select would have to be triggered by a timer every few seconds, which will clutter the message monitoring. There is probably also additional development needed to ensure that only those messages that were processed successfully after the split are removed from the Data Store, while failed messages must remain in the Data Store and be retried with exponential backoff - an option that is already built into the sender adapter for good reason.
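
As pseudocode, that manual approach would look roughly like the sketch below. DataStoreClient, Entry and all method names are invented stand-ins (CPI does not expose such an API), which underlines how much bookkeeping the Data Store sender adapter normally does for us:

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;

// Hypothetical sketch of the timer-triggered manual approach. DataStoreClient,
// Entry and every method name here are invented for illustration only -- this
// is exactly the bookkeeping the Data Store sender adapter already handles
// (selective completion, retry with exponential backoff).
public class ManualDataStorePoller {

    interface DataStoreClient {
        List<Entry> select(String store, int maxResults); // "Select" step equivalent
        void delete(String store, String entryId);        // remove only after success
    }

    record Entry(String id, byte[] payload, int attempts, Instant nextRetryAt) {}

    void poll(DataStoreClient ds, String store) {
        // Triggered by a timer, e.g. every few seconds.
        for (Entry e : ds.select(store, 50)) {
            if (Instant.now().isBefore(e.nextRetryAt())) continue; // still backing off
            try {
                deliver(e.payload());
                ds.delete(store, e.id());                 // successful entries are removed
            } catch (Exception ex) {
                // Failed entries stay in the store; schedule the next attempt
                // with exponential backoff: 2^attempts seconds (cap omitted).
                Instant next = Instant.now().plus(Duration.ofSeconds(1L << e.attempts()));
                // ...persisting attempts+1 and `next` back to the entry is elided...
            }
        }
    }

    void deliver(byte[] payload) throws Exception { /* send to the receiver system */ }
}
```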

Summary

All in all, it looks like we must mix these different options on a case-by-case basis in our landscape. But the decision must be made at design time for each interface, and there is no way to react to dynamically changing message volumes.

  • Use JMS queues, but limit the number of queues while at the same time somehow making sure that one mass transfer will not block the other interfaces - which, it seems, is only possible by running them through dedicated queues.
  • Use Data Stores, but live with the fact that parallel transfers are not possible.

Are there any best practices for handling these scenarios, or are there other concepts we could apply?

Kind regards,
Nils

Ryan-Crosby
Active Contributor

I'd be wary of mixing high volume & asynchronous interfaces with the expectation of near real-time.

nils_bente
Discoverer

Of course I am not expecting near-real-time transmissions for the interface where an unexpected high-volume load is happening.

What I want to avoid is having other interfaces affected by an unexpected high-volume load on one interface, which will happen if they share the same JMS queue. But separating all the interfaces into different JMS queues is not possible due to the limit on the number of JMS queues that can be created.

This is not a problem in other middleware systems like PI/PO. In PI/PO it can be ensured that one interface cannot occupy all worker nodes, so other messages are still processed in a timely manner while the high volume load is running.
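
What PI/PO does could be pictured as a per-interface cap on a shared worker pool, something like this generic sketch (explicitly not a CPI feature):

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.Semaphore;

// Generic sketch of the PI/PO behaviour described above: a shared worker pool
// where no single interface may occupy more than `maxPerInterface` workers,
// so a mass load on A1 cannot starve A2/B1/B2. Not something CPI offers today.
public class PerInterfaceThrottle {

    private final Map<String, Semaphore> permits = new ConcurrentHashMap<>();
    private final int maxPerInterface;

    PerInterfaceThrottle(int maxPerInterface) {
        this.maxPerInterface = maxPerInterface;
    }

    void process(String iface, Runnable delivery) throws InterruptedException {
        Semaphore s = permits.computeIfAbsent(iface, k -> new Semaphore(maxPerInterface));
        s.acquire();                 // at most maxPerInterface concurrent A1 deliveries
        try {
            delivery.run();
        } finally {
            s.release();             // free the slot for the next waiting message
        }
    }
}
```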

Accepted Solutions (0)

Answers (4)

adam_kiwon3
Active Participant

Hi Nils, I agree - it is not really flexible to design for that, but unexpected loads are fortunately rather rare.

However, SAP PI, being a proprietary solution, is able to manage this well, whereas SAP CI is more of an orchestration/mediation component. It can also queue, but the full decoupling and flexibility you might want is handled via pub/sub brokers like Event Mesh/Solace/Kafka/....

My recent experience is that we are moving more and more towards a decomposition of integration components, which mostly have to be used in combination, consisting of API-M, brokers and orchestration/mediation capabilities. No more "monolithic" ESB only. And yes, this is scary from an operations as well as a transparency perspective. We do our best to help on transparency (WHINT), but operations (for tracking and tracing messages) will have to be solved with additional logging tools like Datadog/Splunk....

Best regards, Adam

Sriprasadsbhat
Active Contributor

Hello Nils,

There are already a lot of answers from fellow community members; my few cents would be to use Event Mesh or Advanced Event Mesh, where you can play around with queue properties to handle this prioritization (I think there is a standard property called priority, if I am not wrong), and sizing is not an issue at all.
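
For illustration, the JMS standard itself defines per-message priorities 0 (lowest) to 9 (highest) that a producer can set, and brokers that support it deliver higher-priority messages first (a generic sketch; the queue name and the "isUrgent" logic are made up, and this is not specific to Event Mesh):

```java
import javax.jms.*;

// Generic JMS sketch: priority is set per message by the producer.
// The default priority is 4 (Message.DEFAULT_PRIORITY).
public class PrioritySender {

    static void send(ConnectionFactory factory, String payload, boolean isUrgent) {
        try (JMSContext ctx = factory.createContext()) {
            ctx.createProducer()
               .setPriority(isUrgent ? 9 : Message.DEFAULT_PRIORITY)
               .send(ctx.createQueue("INBOUND"), payload);
        }
    }
}
```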

Regards,

Sriprasad Shivaram Bhat

VijayKonam
Active Contributor

I would use EventMesh to decouple these high-frequency interfaces. I hope you are not on Neo. PO was a system absolutely designed for this, handling async messages amazingly well. Unfortunately we are falling in with the lot of usual middleware systems where the developer has to worry about queueing, sequencing, etc.

adam_kiwon3
Active Participant

Dear Nils,

Before going into the beauty of real PubSub concepts (with Event Mesh or other brokers), I would recommend the following:

  • Separate "regular" interfaces from "high-load" ones. This way you can queue all regular flows in the (generic) XI sender JMS queue (or even a Data Store).
  • For high-load interfaces you should go for JMS, partitioned in whatever way you require parallel/independent processing (mostly per receiver is enough: Option 1, but it might be per receiver interface as well: Option 2). You can use up to 30 queues (and easily upgrade to 100 or even go beyond). The Data Store (Option 3) is not recommended due to the DB limit of 32 GB per tenant, and overall JMS is faster than the Data Store (due to the protocol). Please see Mandy's blog post on how to extend the limits: https://blogs.sap.com/2017/10/04/cloud-integration-jms-resource-and-size-limits-in-cpi-enterprise-ed...

Best regards, Adam

nils_bente
Discoverer

Dear Adam,

Thanks for your feedback!

I will probably follow your advice and use JMS queues for all potential high-load interfaces where parallel processing of multiple messages will be required, and think about the best way to separate the queues while keeping their number reasonable.

For interfaces with lower volume where no parallel processing is required, I tend towards using the Data Store as the decoupling option.

What disappoints me a bit is that I must make these decisions at design time based on factors like message volume, which may change dynamically later at runtime.

Basically, all I want to achieve is a setup where one interface can suddenly create an unexpected load with a very large number of messages without completely blocking other interfaces for a long time. Having a feature in the JMS queues to prioritize messages would probably allow me to keep the design simple (using far fewer queues, and JMS queues for all the asynchronous interfaces) while still being robust against varying message volumes.

Until then, using separate JMS queues seems to be the way to go.

Kind regards, Nils