on ‎2018 Feb 01 4:10 PM
Hi, we load data into the hybris DB via cronjobs, and we usually do not specify any node or cluster details. This worked fine until we noticed that duplicate records were created on one occasion, when the same cronjob instance was executed twice at the same time (within a fraction of a second). We confirmed this from the cronjob-level logs, which show duplicate entries for the same file.
Please help me understand what could have caused these duplicate records to be inserted.
Could you elaborate on how you are performing these inserts? Also, is it possible that the duplicate records were in the data itself?
Regarding the logs being displayed twice, we saw this recently in our application. This could be due to the way SLF4J works with different loggers: there may be more than one logger configuration for the same package prefix. I would suggest checking this.
I would recommend debugging locally, if possible, to trace what could cause the duplicate insertions. This will also clarify whether a single statement execution is printed twice in the log file. The node ID should not matter, as the cronjob will execute on any node but only once. This would not be the case for a hot folder, where any of the nodes will attempt to pick up the files.
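To illustrate the logging point, here is a minimal sketch of how one log call can end up as two lines when both a logger and its parent have an appender attached. It uses `java.util.logging` from the JDK as a stand-in, not the SLF4J/log4j setup of a real hybris installation; the class name, logger names and the `CollectingHandler` helper are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

public class DuplicateLogDemo {

    // Hypothetical handler that records every message it is asked to publish.
    static class CollectingHandler extends Handler {
        private final List<String> lines;

        CollectingHandler(List<String> lines) {
            this.lines = lines;
        }

        @Override
        public void publish(LogRecord record) {
            lines.add(record.getMessage());
        }

        @Override
        public void flush() { }

        @Override
        public void close() { }
    }

    // Returns how many lines a single log call produces.
    static int linesForOneCall(boolean childUsesParentHandlers) {
        List<String> output = new ArrayList<>();
        Logger parent = Logger.getLogger("demo.cronjob");
        Logger child = Logger.getLogger("demo.cronjob.importer");

        // Start from a clean state so repeated calls do not accumulate handlers.
        for (Handler h : parent.getHandlers()) parent.removeHandler(h);
        for (Handler h : child.getHandlers()) child.removeHandler(h);
        parent.setUseParentHandlers(false); // keep the root console handler out of the picture

        // Two configurations match the same package prefix.
        parent.addHandler(new CollectingHandler(output));
        child.addHandler(new CollectingHandler(output));
        child.setUseParentHandlers(childUsesParentHandlers);

        child.info("Importing file feed.csv"); // one statement, executed once
        return output.size();
    }

    public static void main(String[] args) {
        System.out.println(linesForOneCall(true));  // prints 2
        System.out.println(linesForOneCall(false)); // prints 1
    }
}
```

If the log shows two lines but the database shows one row, a configuration overlap like this is the more likely explanation; if the rows themselves are duplicated with different PKs, the job really did insert twice.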
Hello, thanks for responding to my query. The duplicate records are not part of the data feed file. We have a cronjob that runs every 15 minutes. It has been running for a year, but on a couple of occasions we have seen duplicate records created with different PKs. We are not sure what causes this.
Could it be a node issue or a threading issue, given that the action is insert update?
Hi Pardha,
It is a threading issue. These kinds of issues are rare, but they happen when multiple threads insert records at the same time. Setting unique=true for an attribute in items.xml does not create a database-level constraint.
One solution is to create a database-level unique constraint.
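A minimal sketch of the race, assuming the insert update boils down to "look up by key, insert if nothing is found". It is plain Java, not hybris API: an in-memory list stands in for a table without a unique index, a `ConcurrentHashMap` stands in for a table with one, and all class and method names are hypothetical. A latch forces the unlucky interleaving so the result is reproducible.

```java
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.atomic.AtomicLong;

public class UniqueInsertDemo {

    // Runs the same job on two threads and waits for both to finish.
    static void runTwice(Runnable job) {
        Thread first = new Thread(job);
        Thread second = new Thread(job);
        first.start();
        second.start();
        try {
            first.join();
            second.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    // Application-level check only: both threads look up the key before
    // either one inserts, so both conclude the row is missing.
    static int rowsAfterCheckThenInsert(String code) {
        List<String> table = new CopyOnWriteArrayList<>();
        CountDownLatch bothChecked = new CountDownLatch(2);
        runTwice(() -> {
            boolean exists = table.contains(code);
            bothChecked.countDown();
            try {
                bothChecked.await(); // wait until the other thread has checked too
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                return;
            }
            if (!exists) {
                table.add(code); // second row for the same business key
            }
        });
        return table.size();
    }

    // Storage-level uniqueness: putIfAbsent is atomic, so the second insert
    // for the same key is rejected no matter how the threads interleave.
    static int rowsWithUniqueKey(String code) {
        Map<String, Long> table = new ConcurrentHashMap<>();
        AtomicLong pkSequence = new AtomicLong(1);
        runTwice(() -> table.putIfAbsent(code, pkSequence.getAndIncrement()));
        return table.size();
    }

    public static void main(String[] args) {
        System.out.println(rowsAfterCheckThenInsert("PROD-1")); // prints 2
        System.out.println(rowsWithUniqueKey("PROD-1"));        // prints 1
    }
}
```

With a real unique index, the losing insert would fail with a constraint violation instead of silently creating a second row, so the import job would need to handle that error, for example by retrying as an update.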
Architectural suggestion: there should be a dedicated admin node for data imports, running cronjobs, etc.
Regards,
Similar issue is highlighted in the following thread