[rabbitmq-discuss] Queue data recovery after master failure

Andrei theonlyandrei at gmail.com
Wed Jun 18 04:43:14 BST 2014


We have a 12-node cluster with dozens of mirrored queues (2 slaves per queue).
Here's the scenario we're trying to understand how to recover from.

Say we have a complete power failure, and when power is restored one of the nodes is dead.
For at least one queue, that node was the master. The queue is now unresponsive, which is somewhat expected: no failover happened before the crash, so the queue is left with 2 slaves and no master. The queue data (messages) must be physically present on at least one slave, since at least one of them is a disc node. However, we seem to have no way to recover the queue and keep that data.
If we bring up a new node to replace the old one (we reset the old one to simulate a fresh node), the queue becomes available but is now empty. We assume this is the result of the now-empty master synchronizing with the slaves, in the opposite direction from what we'd like.
Is there a way either to designate the slave that still has the data as the master for the troubled queue, or to push that queue data to the new (resurrected) node?

Thanks in advance!
