
6.6/6.7: Why is task.polling.interval.min seemingly not respected? (order process takes minutes)

Former Member
0 Kudos

Question(s):

1) How do we get tasks running in a timely manner under the new AuxiliaryTablesTasksProvider? It seems that task.polling.interval.min, task.polling.interval and task.auxiliaryTables.scheduler.interval.seconds have no impact on scheduling.

2) Why was the change not communicated better? It was announced as a performance improvement, but it totally killed ours.

Context:

We got hit by a very nasty side effect when we recently upgraded from 6.3 to 6.7: processing an order (payment, ERP integration, etc.) suddenly took 60-240 seconds, which is utterly catastrophic.

After some scrambling we found that the problem lies in the process engine, as our order process is indeed modelled as a hybris process. We had lowered the property task.polling.interval.min from 10 to 0 to complete the order process - which has 5-8 process stages - in a timely manner. This had worked very well for over a year.

Enter 6.7 and the extremely long order processing times. After some digging we found that the new process management (AuxiliaryTablesTasksProvider) was wrecking us, as it doesn't seem to respect task.polling.interval.min. This is counter to what was experienced in https://answers.sap.com/questions/12767994/view.html and https://answers.sap.com/questions/12767510/view.html where lowering it helped. But we already had it low!

There's no indication that this would happen anywhere. The closest I can find is in the 6.6 release notes (https://help.hybris.com/6.6.0/hcd/468fd08eae244cc7b959f7ace336951b.html), which say

In this release, we have improved the performance of the task engine. We introduced a new index based on a new attribute that improves indexing efficiency. We have modified the query responsible for searching tasks so that it fetches more frequently the newest tasks. You can configure the threshold on runtime.

but nowhere on the wiki is anything other than task.polling.interval.min mentioned, nor is there any warning that the changes might completely kill performance. The 12 properties of AuxiliaryTables are described on the wiki, but it's not clear exactly how they impact performance. Even reducing the one that seemingly matches interval.min (task.auxiliaryTables.scheduler.interval.seconds) to 1 doesn't help with speed.
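For reference, the settings described above would look roughly like this as property overrides. This is only a sketch: the property names and values are the ones mentioned in this post, while the file name local.properties is an assumption about where the overrides live.

```properties
# local.properties (assumed location for the overrides)

# Lowered from 10 to 0 so the 5-8 stage order process completes quickly.
# Worked for over a year under 6.3.
task.polling.interval.min=0

# Tried under 6.7 with the AuxiliaryTablesTasksProvider; had no visible effect on speed.
task.auxiliaryTables.scheduler.interval.seconds=1
```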

Finally we changed the taskProvider bean to point to the BufferedTaskProvider which respects interval.min, and everything went back to normal. But I'm guessing AuxTables was introduced for a reason, and we'd like to know how to get it to work for us.


There are three points to this question: get some answers, warn others, and vent some steam. I'm not a happy camper right now 😞

Accepted Solutions (0)

Answers (3)


amichele
Explorer
0 Kudos

Thanks Axel. So far so good....

amichele
Explorer
0 Kudos

Thank you for asking and posting your workaround. I did the same and it worked, but there does seem to be a hole in the specs. Did you run into any problems or side effects with this approach (mainly in a clustered environment)?

Former Member

In the 1808 release they reverted to the old provider, writing (see https://jira.hybris.com/browse/ECP-2770) about the auxProvider: "This implementation is treated as experimental and the defaultTasksProvider should be used instead. Using auxiliaryTablesBasedTaskProvider could lead to unexpected behaviour." Let's just say I have strong feelings about this, and not positive ones.

In our cluster, store nodes create order processes that are fulfilled by backoffice nodes. We have had no problems with the defaultTasksProvider; it's humming along as before. I expect it'll work just as before for you too.

0 Kudos

This comes down to the AuxiliaryTablesSchedulerRole strategy. Two settings matter for this strategy: task.auxiliaryTables.scheduler.interval.seconds and task.polling.interval. The strategy is used in clustered environments; see https://help.hybris.com/6.7.0/hcd/5317b05eadb44cbaae57725ed8250d10.html.

task.polling.interval is the wait time of the polling thread, and task.auxiliaryTables.scheduler.interval.seconds governs acquiring the lock.

If you set task.auxiliaryTables.scheduler.interval.seconds to 0, your cluster nodes will sometimes get unique key errors from the database, and every node will acquire the lock each time. The value needs to be greater than task.polling.interval for the lock to take effect. Setting task.polling.interval too low is not recommended either, because it degrades server performance.
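As a rough illustration of the relationship described above, the two settings might be combined as follows. The numbers are illustrative assumptions, not recommended values, and the sketch assumes both properties are expressed in seconds.

```properties
# Wait time of the polling thread (illustrative value; avoid setting it too low)
task.polling.interval=10

# Lock interval for the scheduler role; kept above task.polling.interval
# so the lock takes effect (illustrative value, never 0)
task.auxiliaryTables.scheduler.interval.seconds=15
```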