[rabbitmq-discuss] Node crash, then cluster collapse

carlhoerberg carl.hoerberg at gmail.com
Wed Jun 5 13:56:38 BST 2013


On a three-node cluster, one EC2 machine reboots unexpectedly, and when it
starts up again RabbitMQ fails to start. I've put all the logs here:
https://gist.github.com/carlhoerberg/ff6c6bd4f7639bf4b2f5

When the troubled node is restarted manually, it is unable to rejoin the
cluster: it stops at "adding mirrors" and stays there forever.

The other nodes now start to behave strangely too: new queues can't be
declared, although existing queues seem to continue delivering messages. The
nodes also stop responding to "rabbitmqctl status" and to /api/overview, so
I'm forced to stop them with "kill -9". Only once all nodes are stopped can
the cluster be brought up again normally.
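
For anyone trying to reproduce or monitor this, the "/api/overview stops
responding" symptom can be checked with a timeout rather than waiting on a
hung request. The following is only a minimal sketch, not something taken
from the logs above. It assumes the management plugin is enabled and
listening on port 15672, and that the default guest/guest credentials are
valid; the names probe_node and overview_url and the node hostnames are
hypothetical.

```python
import base64
import socket
import urllib.error
import urllib.request


def overview_url(host, port=15672):
    # Port 15672 is an assumption; adjust to wherever the management API listens.
    return f"http://{host}:{port}/api/overview"


def probe_node(host, user="guest", password="guest", port=15672,
               timeout=5.0, opener=urllib.request.urlopen):
    """Return 'ok', 'timeout', 'unreachable' or 'http <code>' for one node."""
    request = urllib.request.Request(overview_url(host, port))
    # guest/guest is an assumed credential pair; replace with real ones.
    token = base64.b64encode(f"{user}:{password}".encode()).decode()
    request.add_header("Authorization", f"Basic {token}")
    try:
        with opener(request, timeout=timeout) as response:
            return "ok" if response.status == 200 else f"http {response.status}"
    except urllib.error.HTTPError as exc:
        # The node answered, but with an error status (e.g. 401).
        return f"http {exc.code}"
    except socket.timeout:
        # Connection accepted but no reply in time: the "hung node" case.
        return "timeout"
    except urllib.error.URLError as exc:
        # A timeout during connect is wrapped in URLError.
        if isinstance(exc.reason, socket.timeout):
            return "timeout"
        return "unreachable"
    except OSError:
        return "unreachable"


if __name__ == "__main__":
    # Hypothetical hostnames for the three cluster nodes.
    for node in ("rabbit1", "rabbit2", "rabbit3"):
        print(node, probe_node(node))
```

A node that reports "timeout" while still delivering messages from existing
queues would match the behaviour described above, whereas "unreachable"
suggests the node or its management listener is down entirely.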




