on ‎2010 Aug 25 11:49 AM
We run PCo 2.0.1.8 with multiple agents (OPC-DA 2.05a --> MII 12.1.6.Build(91)).
Sometimes our MII-Server is under heavy load and is not able to process PCo messages anymore. Therefore one PCo agent stops working and logs this message:
"System is too busy. Incomming message processing is halted until the system becomes less busy."
The other agents stop working as well. They never recover and never start sending messages again.
This is not about retrying to send an old message. We set "Maximum Retry Attempts" to 3 and "Retry Interval" to 1.
It is about sending new messages after OPC items changed. Our OPC servers change values at least once a minute, which should result in new messages.
Any suggestions to solve this issue?
Regards,
Martin
How many messages are you sending per minute to MII? It seems unlikely that you could overload the web server, which should be able to handle a very large number of URL requests at any one time, each on its own thread. What OS/DB are you running MII on?
Also - are you trying to create a simple data historian with MII to a database? We never recommend this approach. The data historian market is mature and there are many good ones out there.
John,
the reason for the overload was not the number of PCo notifications (max. 40 per minute). We changed a lot in the MII system (HW, SW, OS), so the overload will probably not occur again. But the question was why PCo never recovered from the "System too busy" error.
And you are right about the "simple data historian", but this is only a small part of the MII project. I can't tell the customer to buy another non-SAP product. This would probably embarrass the SAP guy who sold the MII license. I just have to make it work using PCo and MII.
Regards,
Martin
Martin,
Are you investigating why your MII server is getting overloaded? MII 12.1 can use NW CE's clustering ability to do load balancing. Just a thought....
You may want to increase your retry interval, since it takes only a few seconds until PCo stops trying to send the notification (3 tries at 1 second each after a failure may be too impatient for a slow, overloaded MII server).
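The arithmetic behind that suggestion can be sketched as follows. This is a rough model only: it assumes PCo waits one fixed Retry Interval before each retry, and the function name is made up for illustration.

```python
def retry_window_seconds(max_retry_attempts: int, retry_interval_s: float) -> float:
    """Approximate time between the first failed send and giving up.

    Assumption (not documented PCo behavior): one fixed wait of
    retry_interval_s precedes every retry, and nothing else adds delay.
    """
    return max_retry_attempts * retry_interval_s


# Settings from this thread: 3 retries, 1 second apart.
print(retry_window_seconds(3, 1))
# A hypothetical, more patient setting: 3 retries, 10 seconds apart.
print(retry_window_seconds(3, 10))
```

Under these assumptions, the current settings give MII roughly 3 seconds to become responsive before the notification is given up.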
PCo was designed to be more of a push technology than a constant pulling application. If you need values to be updated in the time frame you mentioned about your OPC server, you may want to use UDS instead. PCo is geared more toward event notification based on expression evaluation (e.g. cylinder temp > 90F or line yield < 30). I am not sure if you want to be sending out alerts every minute with PCo.
Regards,
Kevin
Kevin,
of course we try to avoid this error and investigate the reason, but it may happen and we would like PCo to recover from it.
PCo and its push technology is the right choice for us. We have to get informed when an OPC value changed. We do expression evaluation as well, but a few machines change their values so often that we get notifications every 3 to 5 seconds. And this is the reason for the very short retry interval of 1 second and 3 retry attempts in this case. If the notification comes too late, the information is outdated and we don't need it anymore.
The retry interval is not the point. When the MII server becomes less busy and is ready to process new notifications again, PCo does not even send new notifications. The agents have to be restarted to work again. That's our problem.
Expected behavior of PCo
00:00 OPC change --> MII idle --> Send Notification --> OK
00:03 OPC change --> MII busy --> retry 3 times, then throw away --> OK
00:06 OPC change --> MII still busy --> retry 3 times, then throw away --> OK
00:09 OPC change --> MII not busy anymore --> Send Notification --> OK
...
Reality
00:00 OPC change --> MII idle --> Send Notification --> OK
00:03 OPC change --> MII busy --> retry 3 times, then throw away --> OK
00:06 OPC change --> MII still busy --> PCo does nothing --> BAD
00:09 OPC change --> MII not busy anymore --> PCo does nothing --> BAD
...
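The expected timeline can be written as a small simulation. This is a sketch of the behavior we would like to see, not of how PCo is implemented: the function names are hypothetical, and for simplicity MII is assumed to stay busy (or idle) for the whole retry window of one notification.

```python
from typing import Callable, List


def handle_change(t: int, mii_busy: Callable[[int], bool],
                  max_retry_attempts: int = 3) -> str:
    """Try to deliver one notification: first attempt plus retries."""
    for _attempt in range(1 + max_retry_attempts):
        # Simplification: the busy state is looked up by the time of the
        # OPC change and does not change while retrying.
        if not mii_busy(t):
            return "sent"
    return "discarded"


def process_changes(change_times: List[int],
                    mii_busy: Callable[[int], bool]) -> List[str]:
    """Each OPC change gets a fresh attempt; no state is carried over
    from an earlier failed notification."""
    return [handle_change(t, mii_busy) for t in change_times]


# MII is busy from second 3 up to (not including) second 9.
results = process_changes([0, 3, 6, 9], lambda t: 3 <= t < 9)
for t, outcome in zip([0, 3, 6, 9], results):
    print(f"00:{t:02d} {outcome}")
```

The important property is in `process_changes`: a discarded notification does not stop later changes from being sent. In the "Reality" timeline, the agent behaves as if the first failure switched it off permanently.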
Best regards,
Martin
Martin,
I agree PCo should recover from this situation. The Retry Interval is very important in this scenario. Once a notification exceeds the number of Retry Attempts, the message goes into the failed queue. At that point, manual intervention is needed to resend the message to MII (on the Message Failures tab of the Agent). So it is crucial that the Maximum Retry Attempts and Retry Interval parameters are set correctly, so that manual intervention is reduced. I assume you have your Failed Message Persistence set to KeepAll, since you want all messages to be queued. Once a notification reaches its Lifetime setting, the failed message is moved to the Expired Messages queue (on the Agent's Expired Messages tab). That is how messaging works in PCo.
For the notifications, I would check your trigger expression and verify the trigger type. I assume it's Always. Increase the Retry Interval and verify the other settings as well. You may need to clean up the Failed and Expired messages.
The agents could be the issue if they all point to the same MII server (destination). They could overload it, depending on the number of agents and the frequency of your value changes. Keep in mind that each agent is an independent service on the box where PCo is installed. It also depends on your system landscape. Is PCo installed on the same box as OPC or historians? PCo could be impacted by other software running.
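To illustrate the queue handling described above, here is a minimal model of how a failed message moves to the expired queue once its lifetime is over. The class and function names are invented for this sketch, and the exact rule PCo uses to measure the lifetime is an assumption here (age counted from message creation).

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class FailedMessage:
    payload: str
    created_at: float  # seconds, on any monotonic clock


def split_failed_and_expired(
    messages: List[FailedMessage], now: float, lifetime_s: float
) -> Tuple[List[FailedMessage], List[FailedMessage]]:
    """Partition messages that already exceeded their retry attempts.

    Messages younger than lifetime_s stay in the failed queue (where a
    manual resend is possible); older ones go to the expired queue.
    """
    failed = [m for m in messages if now - m.created_at < lifetime_s]
    expired = [m for m in messages if now - m.created_at >= lifetime_s]
    return failed, expired


queue = [FailedMessage("temp=91", 0.0), FailedMessage("temp=93", 50.0)]
failed, expired = split_failed_and_expired(queue, now=60.0, lifetime_s=30.0)
print([m.payload for m in failed])   # still resendable
print([m.payload for m in expired])  # lifetime exceeded
```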
Regards,
Kevin
Kevin,
the Failed Message Persistence is set to KeepLast, because we don't need outdated notifications. PCo/MII has to process the notification in time or not at all. But I changed the Retry Interval from 1 to 5 seconds and reduced the Maximum Retry Attempts from 3 to 2, to give MII some more time to recover. Maybe it helps.
You are right about the Trigger Type. In this specific case it is set to Always. All the agents are sending to the same MII. PCo has its own machine, without any other processes (OPC, NetWeaver/MII, Database) running on it. (max. 1.7 of 6 GB RAM full, max. 10% CPU used).
Regards,
Martin