You want to be notified when messages in your PI Adapter Engine are stuck or hanging and are no longer processed. A message might be stuck because it is no longer in the processing queue (for whatever reason), or because an error occurred during processing and it is now waiting in an error status. It might also be an EOIO (Exactly Once In Order) message that is blocked in Holding status because a previous message in the same EOIO context has an error.
You want the system to inform you actively about such messages so that you can take action: start the monitoring tools, fix the problem and restart the messages. PI supports alerting for messages that ran into an error (CBMA, Component Based Message Alerting). But your particular message may have got stuck without going into an error status, for example a message that is held in an EOIO queue, or one that is in status “To Be Delivered” although the scheduler dropped it. The normal alerting won't help in these cases, because it only considers real error statuses. Or maybe you simply didn’t create an Alert Rule for this particular PI scenario, so the system won’t create an Alert at all.
So far, the PI monitors only support manual monitoring. You might have to look into the Message Overview and the Message Monitor several times per day to check for suspicious messages. In newer releases, the Message Overview offers a mode that shows the blocked messages. This feature is available with the releases and Support Packages described in SAP note 2182422. But it is still a manual activity, and somebody has to look into the monitoring tool.
This article provides a solution that automatically checks the Java Adapter Engine for stuck/hanging messages at regular intervals and sends Alerts. It’s based on a background job that you can schedule in your system. The system runs the job periodically, and the job sends Alerts if it detects such messages. The job is based on the SAP NetWeaver Scheduler for Java, the CBMA (Component Based Message Alerting) and the Message Overview data. You may therefore have to configure your system accordingly and enable these features.
The following sections will show you how to deploy the new job to your system, how to configure it and how to consume the alerts (and send out emails).
Note: The solution provided here is only relevant for the PI Java stack, so you can use it on any Adapter Engine (central and decentral) or in Process Orchestration systems. It does not cover ABAP systems like the Integration Engine or ABAP proxy systems.
Note: The solution provided here becomes irrelevant for release 7.50 with the changes from SAP note 2279043. This note adds the same functionality in the standard product with 7.50 SPS06. However, the note only covers release 7.50 and there was no update for older releases like 7.31 or 7.40.
This article provides a background job that you can deploy on your system. The job runs periodically, checks for messages that are in a non-final status and have not been updated or changed for a certain time, and sends Alerts for them. The non-final message statuses on the Adapter Engine are System Error, Waiting, To Be Delivered, Delivering and Holding.
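The selection rule can be illustrated with a small, standalone Java sketch. It is not the code of the deployed job: the class, enum and method names are hypothetical, `DELIVERED` only stands in for any final status, and treating "exactly the threshold" as stuck is an assumption.

```java
import java.time.Duration;
import java.time.Instant;
import java.util.EnumSet;
import java.util.Set;

/** Illustrative sketch only; all names are hypothetical, not taken from the job. */
final class StuckMessageCheck {

    /** The five non-final statuses named in the article, plus one assumed final status. */
    enum Status { SYSTEM_ERROR, WAITING, TO_BE_DELIVERED, DELIVERING, HOLDING, DELIVERED }

    static final Set<Status> NON_FINAL = EnumSet.of(
            Status.SYSTEM_ERROR, Status.WAITING, Status.TO_BE_DELIVERED,
            Status.DELIVERING, Status.HOLDING);

    /**
     * A message counts as stuck if its status is non-final and its last
     * update lies at least stuckAfter in the past (">=" is an assumption).
     */
    static boolean isStuck(Status status, Instant lastUpdate, Instant now, Duration stuckAfter) {
        if (!NON_FINAL.contains(status)) {
            return false; // final statuses are never reported
        }
        return Duration.between(lastUpdate, now).compareTo(stuckAfter) >= 0;
    }
}
```

For example, a message in Holding status whose last update was 90 minutes ago would be reported with a 60 minute threshold, while a delivered message would never be reported.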
The required configuration steps are:
Download the archive and deploy it to your system
Schedule the job
Configure an Alert consumer to send emails
Step 1. Download the application archive and deploy it to your system
You can find the file “cust.sap.com~pi.alertstuckmsg.ear” in this GitHub repository. It contains the Java classes, descriptors and J2EE application for the new background job. Download it and deploy it to your Adapter Engine. You can do the deployment with the standard SAP deployment tools like JSPM, SAP NetWeaver Developer Studio or via telnet. Please be careful to download the full EAR file and not just the JAR file that is packed into it.
During deployment, it will install a new Java application in your system. You can find it if you go to NetWeaver Administrator -> Operations -> Systems -> Start&Stop -> Java Applications. The application list should have a new entry with name “pi.alertstuckmsg” and vendor “cust.sap.com”. Please verify that it is started without errors.
The deployment also creates a new background job template, which you can now use to schedule a periodic job. This is described in the next step. The job itself is based on the Java scheduler framework, offered by the WebAS Java. It only uses standard services that are available in all installations and doesn’t require any additional deployments.
Step 2. Schedule the job
After deployment, you can schedule a new background job. The configuration is done in NetWeaver Administrator -> Operations -> Jobs -> Java Scheduler. Here you can find the new job template in tab “Job Definitions”. It should list a new job type with name “AlertStuckMessagesJob” and application “cust.sap.com/pi.alertstuckmsg”.
Switch back to the tab “Tasks” to add a new instance of the job. Click the Add button to start the wizard to configure a new instance of the job.
In the first step “Select Job” select the new job type “AlertStuckMessagesJob” and click on Next.
In step two “Set Details”, give your job instance a unique name and adjust the retention time for the job log entries that it will write on each execution.
In step three “Set Properties” you can set several job parameters. They are described below.
In step four “Set Execution Time” you can configure how often the job will be executed. You can use Recurring for regular executions every few minutes or hours (“every 60 minutes”), or Cron for executions at a certain time each day (“every day at 09:00 AM”).
After the last step, the new job instance is created and will execute according to the scheduled intervals. You can find the new entry in the “Tasks” tab.
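To make the difference between the two execution modes concrete, here is a minimal standalone Java sketch of how the next execution time could be derived in each case. It does not use or reproduce the Java Scheduler API; the class and method names are hypothetical.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.LocalTime;

/** Illustrative sketch only; not the SAP NetWeaver Scheduler API. */
final class ScheduleSketch {

    /** Recurring: the next run is a fixed interval after the previous one. */
    static LocalDateTime nextRecurring(LocalDateTime lastRun, Duration interval) {
        return lastRun.plus(interval);
    }

    /** Cron-like daily rule: the next run is the next occurrence of a fixed time of day. */
    static LocalDateTime nextDaily(LocalDateTime now, LocalTime at) {
        LocalDateTime candidate = now.toLocalDate().atTime(at);
        // If today's slot has already passed (or is exactly now), use tomorrow's.
        return candidate.isAfter(now) ? candidate : candidate.plusDays(1);
    }
}
```

With Recurring, the gap between two checks is constant, which suits a stuck-message check that should run throughout the day. A daily Cron rule is better suited to a single summary check at a known time.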
The "AlertStuckMessagesJob" has several parameters that influence the execution. Some parameters are used to select specific message scenarios by the header attributes. Others control after which time a message is regarded as hanging/stuck, who will receive the generated Alert and what will be the content.
AlertConsumer: Alerts are sent to this consumer (see the documentation of CBMA). You have to configure a job that fetches them from this consumer queue and triggers the final notifications. There are, for example, consumers for the Solution Manager or the email consumer job AlertConsumerJobV2. If you forget to configure a consumer that picks up the Alerts from the consumer queue, the Alerts will pile up in the queue and require more and more memory and database table space.
StuckAfter_min: Messages are regarded as stuck/hanging if they are in a non-final status and no status change has happened for this number of minutes. If a message is still automatically retried by the system, it actually has status updates (for example, the automatic retry sets it back to Delivering status and then it goes into error again), so such messages may not be selected, which is probably even desired. The minimum allowed value for this parameter is 5 (minutes). Smaller values are ignored.
MessageAgeLimit_min: The limit for the PI message age in minutes. The age is the time between the message's creation time (when it was first sent) and the current time. Only younger messages are selected. This can be used to limit the selected messages, since a message is probably not interesting anymore once it has been in a non-final status for 7 or 14 days. The value for this parameter has to be larger than the value of the parameter StuckAfter_min.
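The interplay of StuckAfter_min and MessageAgeLimit_min can be sketched as follows. This is a standalone illustration with hypothetical names, not the job's actual code. Two behaviours are assumptions: a StuckAfter_min value below 5 falls back to 5 (the article only says smaller values are ignored), and an invalid MessageAgeLimit_min is rejected with an exception.

```java
import java.time.Duration;
import java.time.Instant;

/** Illustrative sketch only; names and error handling are assumptions. */
final class JobParameters {

    static final int MIN_STUCK_AFTER_MIN = 5;

    final Duration stuckAfter;
    final Duration messageAgeLimit;

    JobParameters(int stuckAfterMin, int messageAgeLimitMin) {
        // Assumption: values below the minimum fall back to 5 minutes.
        int effectiveStuckAfter = Math.max(stuckAfterMin, MIN_STUCK_AFTER_MIN);
        // The age limit has to be larger than StuckAfter_min.
        if (messageAgeLimitMin <= effectiveStuckAfter) {
            throw new IllegalArgumentException(
                    "MessageAgeLimit_min must be larger than StuckAfter_min");
        }
        this.stuckAfter = Duration.ofMinutes(effectiveStuckAfter);
        this.messageAgeLimit = Duration.ofMinutes(messageAgeLimitMin);
    }

    /** Only messages created less than messageAgeLimit ago are selected. */
    boolean isYoungEnough(Instant createdAt, Instant now) {
        return Duration.between(createdAt, now).compareTo(messageAgeLimit) < 0;
    }
}
```

With StuckAfter_min = 60 and MessageAgeLimit_min = 10080 (7 days), a message created one day ago is still considered, while one created eight days ago is skipped.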
AlertWithMessageID: If this is enabled, the job will send individual Alerts for each message ID that it found to be stuck. If it’s disabled, it will only send one aggregated Alert per message scenario (not for each message) with the message counter. It can generate many Alerts if you select to send individual Alerts for each message ID. Therefore, you have to be careful not to flood your system with many Alerts. The parameters MessageAgeLimit_min and MaxAlertsPerScenario can help to limit the number of Alerts.
MaxAlertsPerScenario: This parameter is only used if AlertWithMessageID is enabled. It sets a limit for the number of Alerts for individual messages. The limit is applied per scenario and is reset for each new scenario. For example, if you set it to 10, each job execution sends at most 10 Alerts for each of your PI message scenarios. A value of 0 means no limit. It is strongly recommended to always set a value greater than 0 for this parameter to avoid flooding the system in case there are many stuck messages.
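How AlertWithMessageID and MaxAlertsPerScenario work together can be illustrated with a short standalone Java sketch. The names and the alert texts are hypothetical and do not reflect the real Alert payload. Treating negative limits like 0 is an assumption of this sketch.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

/** Illustrative sketch only; alert texts and names are made up. */
final class AlertPlanner {

    /**
     * @param stuckByScenario scenario name mapped to the IDs of its stuck messages
     * @return one text per Alert that would be sent in this job execution
     */
    static List<String> plan(Map<String, List<String>> stuckByScenario,
                             boolean alertWithMessageId, int maxAlertsPerScenario) {
        List<String> alerts = new ArrayList<>();
        for (Map.Entry<String, List<String>> entry : stuckByScenario.entrySet()) {
            String scenario = entry.getKey();
            List<String> ids = entry.getValue();
            if (ids.isEmpty()) {
                continue; // nothing stuck in this scenario
            }
            if (!alertWithMessageId) {
                // One aggregated Alert per scenario, carrying the message counter.
                alerts.add(scenario + ": " + ids.size() + " stuck message(s)");
                continue;
            }
            // The limit applies per scenario; 0 (or less, in this sketch) means no limit.
            int limit = maxAlertsPerScenario > 0
                    ? Math.min(maxAlertsPerScenario, ids.size())
                    : ids.size();
            for (int i = 0; i < limit; i++) {
                alerts.add(scenario + ": message " + ids.get(i) + " is stuck");
            }
        }
        return alerts;
    }
}
```

Because the counter restarts for every scenario, one scenario with thousands of stuck messages cannot use up the Alerts that would otherwise be sent for another scenario.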
Step 3. Configure an Alert consumer to send emails
After the “AlertStuckMessagesJob” is configured, it sends Alerts for the stuck messages and scenarios to the consumer queue named in the job parameter AlertConsumer. The Alerts pile up in this queue until a consumer actually fetches them and sends the Alert notifications to the users.
One option for consuming the Alerts is the email consumer job “AlertConsumerJobV2” provided by PI, which is configured in the same Java Scheduler tool as the “AlertStuckMessagesJob”. The CBMA documentation on help.sap.com and SAP note 2088606 describe this email consumer job.
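The producer/consumer relationship between the two jobs can be pictured with a small in-memory stand-in. This is only a conceptual sketch with hypothetical names. It is not the CBMA API, and the real consumer queue is managed by the system.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;

/** Conceptual stand-in for a consumer queue; not the CBMA API. */
final class ConsumerQueueSketch {

    private final BlockingQueue<String> queue = new LinkedBlockingQueue<>();

    /** Called by the producing job: each Alert is appended to the queue. */
    void publish(String alert) {
        queue.add(alert);
    }

    /** Alerts waiting for a consumer; grows without bound if nobody consumes. */
    int pending() {
        return queue.size();
    }

    /** Called by the consumer job: removes all pending Alerts for delivery. */
    List<String> consume() {
        List<String> batch = new ArrayList<>();
        queue.drainTo(batch);
        return batch;
    }
}
```

Without a scheduled consumer, `pending()` only ever increases, which corresponds to the growing memory and database usage mentioned for the AlertConsumer parameter.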
In case you encounter problems with the “AlertStuckMessagesJob”, you can use different sources for error analysis. They can help you find out why no Alerts are sent out or why the job throws exceptions.
One source is the default trace file of the WebAS Java and the NetWeaver Administrator Log Viewer tool. For more detailed traces use the Log Configurator tool and set the trace location for package “com.sap.pi.alertstuckmsg.*” and the sub-tree to Debug severity. The job will then write detailed entries into the default trace file on each execution, which you can find in the Log Viewer tool.
Another source is the log of the Java Scheduler itself (the same tool where you created and configured the job). You can find it on the tab “Jobs” of the Java Scheduler. In the first table, select the entry for the invocation of the “AlertStuckMessagesJob”, then select the tab “Log” in the Job Details below the table.
The GitHub repository “pi_alert_stuck_msg” also contains the full source code for reference.