
Ping Cluster blocking server startup

Former Member
0 Likes
1,302

Hi,

During deployment of our project we shut down all the servers in the cluster and start them up again in parallel. The problem is that startup is slowed down by cluster messages like

 hybrisnode-4: JOIN(hybrisnode-4) sent to hybrisnode-1 timed out (after 3000 ms), on try 10

and the server does not start if all 10 attempts are used up (on average this takes about 10 minutes). Is there a way to avoid this?

I already tried setting

 cluster.ping.load.on.startup=false

but without any success.

Any ideas? Thanks in advance, Regards

Hybris version 6.6.0.2

Accepted Solutions (0)

Answers (5)


0 Likes

The solution is described in SAP Note 2736516: https://launchpad.support.sap.com/#/notes/2736516

The problem relates to entries in the JGROUPSPING table, which are not deleted on server restart.

Former Member
0 Likes

I checked the JGROUPSPING table and there were far too many records in it. After removing all of them, the problem seems to have disappeared.

Former Member
0 Likes

OK, so you're using JDBC_PING? It may be a 6.6 issue; I would open a support ticket.

Former Member
0 Likes

First, if you are using the stock Hybris Commerce jgroups-tcp config, look for problems with the JGROUPS table in the cluster's database. The nodes advertise their availability to peers there. Are the nodes listed actually up?

Second, make sure your nodes can actually reach each other's JGroups TCP port.

 hybrisnode-4$ nc -z hybrisnode-1 "${hybris_jgroups_bind_port}"

If you're doing this in an IaaS cloud, make sure all the VMs have a network/firewall policy that allows them to talk to their peers on their JGroups address and port.
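
To run the same reachability check against every peer in one go, something like the following could work. It is only a sketch: the function name `check_peers` is made up for illustration, the host names and port in the usage line are placeholders for your own values, and it assumes your `nc` build supports the `-z` and `-w` options.

```shell
# Hypothetical helper: probe the JGroups TCP port on each peer with nc.
# Usage: check_peers <port> <host> [<host> ...]
# Prints one OK/FAIL line per peer and returns non-zero if any peer failed.
check_peers() {
  local port="$1"
  shift
  local peer
  local failed=0
  for peer in "$@"; do
    # -z: only test whether the port accepts connections, send no data
    # -w 3: give up after 3 seconds instead of hanging
    if nc -z -w 3 "$peer" "$port" >/dev/null 2>&1; then
      echo "OK   $peer:$port"
    else
      echo "FAIL $peer:$port"
      failed=1
    fi
  done
  return "$failed"
}

# Example call (placeholder hosts, run from hybrisnode-4):
# check_peers "${hybris_jgroups_bind_port}" hybrisnode-1 hybrisnode-2 hybrisnode-3
```

Running it from each node in turn shows quickly whether a single firewall rule or a node that is still down is behind the JOIN timeouts.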

Former Member
0 Likes

We are only stopping the Hybris process. The point is that the configuration did not change, but we migrated from version 5.7 to 6.6.

We are using JGroups via TCP with the standard configuration.

Former Member
0 Likes

How did you configure JGroups, and which clustering method and discovery mechanism do you use? Does "shut down" here mean killing the node without gracefully stopping Hybris? Typically, if you redeploy Hybris you don't need to shut down the server, just the Hybris process.