Cron Job performance

Former Member

Can a cron job handle a large data set of around 2 million records? I have to write a cron job that will run about 5 times a day, process the price and inventory tables, and write the results into a new table. How efficient will a cron job be at handling this much data? Are there any statistics on cron job performance? Will it impact the application's performance?

Accepted Solutions (1)

Former Member

This is doable on hybris, but it is a critical amount of data at a relatively high frequency: 2 million records five times a day is 10 million records per day, continuously. It will need careful design of your infrastructure and import process.

You definitely need a separate cluster node for this import activity, so that the nodes serving the e-commerce site or the business user cockpits are not affected at the application server level.

For the DB you need to set up an architecture that is performant enough and has room to scale further. Run performance tests in a pilot phase and learn early how this setup behaves. Usually in hybris scenarios the higher load is on the application server, but pure import tasks like prices / inventory will have more impact on the DB than average e-commerce traffic does.

In parallel, think about alternative designs, such as incremental updates for prices and inventory, or not importing the inventory at all and instead pulling it on demand via a REST API, possibly with caching to reduce the traffic on the external system. You can do the same with prices, but that is a bit more complicated and more effort.
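To make the "pull on demand, with caching" idea a bit more concrete, here is a minimal sketch in plain Java. It is not hybris API: the class name `TtlCache`, the `loader` function (standing in for the REST call to the external stock system) and the injected clock are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;
import java.util.function.LongSupplier;

/** Hypothetical time-to-live cache in front of an external lookup (e.g. a stock REST call). */
final class TtlCache<K, V> {

    /** A cached value together with the time at which it stops being valid. */
    private static final class Entry<V> {
        final V value;
        final long expiresAt;

        Entry(V value, long expiresAt) {
            this.value = value;
            this.expiresAt = expiresAt;
        }
    }

    private final Map<K, Entry<V>> entries = new ConcurrentHashMap<>();
    private final Function<K, V> loader;   // assumed to wrap the call to the external system
    private final long ttlMillis;
    private final LongSupplier clock;      // injected so the expiry logic can be tested

    TtlCache(Function<K, V> loader, long ttlMillis, LongSupplier clock) {
        this.loader = loader;
        this.ttlMillis = ttlMillis;
        this.clock = clock;
    }

    V get(K key) {
        long now = clock.getAsLong();
        Entry<V> cached = entries.get(key);
        if (cached != null && now < cached.expiresAt) {
            return cached.value;           // still fresh: no external call
        }
        // Missing or expired: ask the external system and remember the answer.
        // Concurrent callers may both load the same key; that is acceptable for this sketch.
        V fresh = loader.apply(key);
        entries.put(key, new Entry<>(fresh, now + ttlMillis));
        return fresh;
    }
}
```

In production code the clock would typically be `System::currentTimeMillis`, and the TTL decides how stale a displayed stock level may be versus how much traffic reaches the external system.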

So, bottom line: get a pilot system of a certain scale in place as soon as possible, test it, and then adjust your solution based on what you learn, at both the hardware architecture and the software design level.

Hopefully someone from Hybris can also give some elaborate feedback on your question.

Answers (1)

Former Member

Konrad captured the important points. You definitely want to validate whether you really need to send this many changes per day. Are these really just the changes to inventory and prices, or is this the full set of inventory and price records?
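If the feed turns out to be a full dump rather than changes only, one way to cut the volume is to compare it with the last imported snapshot and write only what differs. A minimal sketch in plain Java, assuming both snapshots fit in memory as maps keyed by something like a SKU; `DeltaFilter` and `changedEntries` are hypothetical names, not hybris API:

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;

/** Hypothetical helper: keeps only feed entries that differ from the last imported snapshot. */
final class DeltaFilter {

    private DeltaFilter() {
    }

    static <K, V> Map<K, V> changedEntries(Map<K, V> previous, Map<K, V> incoming) {
        Map<K, V> delta = new HashMap<>();
        for (Map.Entry<K, V> entry : incoming.entrySet()) {
            K key = entry.getKey();
            // New keys and keys whose value changed both need to be written.
            if (!previous.containsKey(key) || !Objects.equals(previous.get(key), entry.getValue())) {
                delta.put(key, entry.getValue());
            }
        }
        return delta;
    }
}
```

This sketch does not detect records that disappeared from the feed; if deletions matter, the keys present in `previous` but missing from `incoming` would have to be handled separately.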

Another option is to use a NoSQL database for inventory and prices. It is very fast to update and has the advantage that importing inventory and prices doesn't impact the performance of the primary database (essentially we're doing some sharding).

For what it's worth, we can import pricing and inventory pretty fast if we optimize the process. For example, we did some testing on some very basic hardware with just 4 worker threads and were able to achieve 3500 items/s inserting prices. Updates are much slower because we need to load the existing data first, but we have some ideas on how this could be optimized further if required for specific types (e.g. Prices and Stock).
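As a rough illustration of the "few worker threads" approach, the sketch below splits the records into batches and hands them to a fixed pool of workers. It is plain Java, not the hybris import mechanism: `BatchImporter`, `importAll` and the `batchWriter` callback (which would do the actual DB write) are assumptions for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.Consumer;

/** Hypothetical batch importer: writes records in fixed-size batches on a small worker pool. */
final class BatchImporter {

    private BatchImporter() {
    }

    static <T> int importAll(List<T> records, int batchSize, int workers, Consumer<List<T>> batchWriter) {
        ExecutorService pool = Executors.newFixedThreadPool(workers);
        try {
            List<Future<Integer>> pending = new ArrayList<>();
            for (int from = 0; from < records.size(); from += batchSize) {
                // subList is a view, so the records list must not be modified while the import runs.
                List<T> batch = records.subList(from, Math.min(from + batchSize, records.size()));
                pending.add(pool.submit(() -> {
                    batchWriter.accept(batch);   // assumed to persist one batch, e.g. in one transaction
                    return batch.size();
                }));
            }
            int written = 0;
            for (Future<Integer> result : pending) {
                written += result.get();         // waits for the batch and surfaces any failure
            }
            return written;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("Import interrupted", e);
        } catch (ExecutionException e) {
            throw new IllegalStateException("Batch failed", e.getCause());
        } finally {
            pool.shutdown();
        }
    }
}
```

For a sense of scale: at the quoted 3500 items/s, 2,000,000 pure inserts would take about 2,000,000 / 3500 ≈ 571 seconds, a little under 10 minutes per run. Updates would take longer, for the reason given above.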