[rabbitmq-discuss] Queue data recovery after master failure

Andrei D. theonlyandrei at gmail.com
Tue Jul 15 20:34:31 BST 2014


Great, I think we can make that assumption (slaves down) in the scenario I
described.
I'm thinking the recovery procedure would look like this:
1. Power up all nodes without starting rabbit; say node X doesn't come up.
2. Start rabbit on all the nodes that were not a slave for any queue whose
master was on X.
3. Run the new and improved :) forget_cluster_node X. This should promote
some (offline) slave S to master.
4. Start rabbit on S (and on the rest of the nodes). S should now be master
and have all the messages it had when the cluster went down.
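
To make the start order in steps 2 and 4 concrete, here is a rough Python
sketch of how I'd work out which nodes to start before and after forgetting
X. It only models the bookkeeping; it does not call rabbitmqctl. The function
name, the node names and the queue layout are all made up for illustration,
and I'm assuming I know each queue's master and slaves from before the outage.

```python
def plan_start_order(nodes, failed, queues):
    """Split surviving nodes into 'start before forget' and 'start after'.

    nodes:  list of node names in the cluster (hypothetical names)
    failed: the node X that did not come back up
    queues: dict of queue name -> (master node, list of slave nodes),
            as known from before the outage (an assumption of this sketch)
    """
    # Nodes that were a slave for any queue mastered on the failed node
    # are held back, so one of them can be promoted while still offline.
    held_back = set()
    for master, slaves in queues.values():
        if master == failed:
            held_back.update(slaves)

    survivors = [n for n in nodes if n != failed]
    start_first = [n for n in survivors if n not in held_back]  # step 2
    start_last = [n for n in survivors if n in held_back]       # step 4
    return start_first, start_last


if __name__ == "__main__":
    first, last = plan_start_order(
        ["A", "B", "C", "X"],
        "X",
        {"q1": ("X", ["B"]), "q2": ("A", ["C"])},
    )
    print("start before forget_cluster_node:", first)
    print("start after forget_cluster_node:", last)
```

In this made-up layout, B was a slave for q1 (mastered on X), so B stays
stopped until after the forget step, while A and C are started first.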
Assuming the procedure above works (could you kindly confirm?), what do you
think the ETA is for the next official release that includes the required
forget_cluster_node fix (the one that's already in the nightly build)?
Thanks!
Andrei




