[rabbitmq-discuss] Blocked connections on rabbitmq 2.2.0
s4nchez at gmail.com
Thu Feb 3 20:36:18 GMT 2011
Yesterday I noticed some very strange behaviour on one of our
rabbitmq cluster nodes. Its queues became unresponsive, and running
"rabbitmqctl list_connections" reported that all the connections
were either "blocking" or "blocked". The documentation doesn't mention
these states. Does anyone know what they mean?
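For anyone wanting to check the same thing on their own nodes, here is a minimal sketch that tallies connection states from captured "rabbitmqctl list_connections" output. It assumes the output is tab-separated with the state in the last column, and that the "Listing connections ..." and "...done." banner lines are present; the function name and the sample addresses are made up for illustration.

```python
from collections import Counter


def count_connection_states(output: str) -> Counter:
    """Tally the last column (assumed to be the connection state)."""
    counts = Counter()
    for line in output.splitlines():
        line = line.strip()
        # Skip blank lines and the banner lines printed around the listing.
        if not line or line.startswith("Listing") or line.startswith("...done"):
            continue
        fields = line.split("\t")
        if len(fields) < 2:
            continue  # not a connection row
        counts[fields[-1]] += 1
    return counts


if __name__ == "__main__":
    # Hypothetical sample output, not taken from a real node.
    sample = (
        "Listing connections ...\n"
        "guest\t10.0.0.1\t5001\tblocked\n"
        "guest\t10.0.0.2\t5002\tblocking\n"
        "guest\t10.0.0.3\t5003\tblocked\n"
        "...done.\n"
    )
    for state, n in count_connection_states(sample).items():
        print(state, n)
```

Running it against the output saved from each node in turn would show whether the "blocking"/"blocked" connections are confined to one node or spread across the cluster.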
To give a bit of context: we noticed this problem when a good portion
of our clients stopped receiving messages. Looking at all the servers,
we found one that was using too much CPU and also swapping to disk.
This node was the only one presenting this behaviour, but it seemed
that the problem compromised the whole cluster. We haven't touched these
servers for ages (they are dedicated to rabbitmq) and the system load
was well within normal levels. We use simple DNS round-robin for
clients to connect to the cluster and none of our messages are
persistent, so seeing swapping really scared me.
After restarting the whole cluster a few times the problem
persisted, always on the same server. We even tried force_reset on all
nodes, but that didn't help either. Things only went back to normal
after we removed the problematic node from the cluster. Now my task is
to figure out what the problem might be.
Has anyone had experience with this kind of behaviour? I'm even
considering a hardware problem, but so far I haven't found anything
indicating that is the case.
Any help is welcome.