[rabbitmq-discuss] cluster reboot of death

Mark Ward ward.mark at gmail.com
Tue Oct 30 18:52:41 GMT 2012


The scenario:

Maintenance on the servers in a cluster requires each server to be 
restarted.  When a server is restarted, the next-oldest node becomes master 
of the queues orphaned by the restart.  This continues until the last 
server is the master of all queues (ignoring queues newly created during 
maintenance).  Idle queues holding data are left unsynchronized, and their 
data exists only on the master server.  The problem now is that the last 
server cannot be restarted, or else the rest of the cluster will pick a new 
master for the queues and the idle data will be lost.  Active queues will 
eventually work themselves out and become synchronized across the cluster.  
Idle queues create a problem.
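
To make the scenario concrete, here is a minimal sketch in Python that 
models one idle mirrored queue during a rolling restart.  It is a toy model 
of the behaviour described above, not of RabbitMQ internals: the function 
name and the rules it encodes (a restarted node rejoins empty, an idle 
queue never resynchronizes, the oldest remaining node is promoted) are 
assumptions taken from this description.

```python
def rolling_restart(nodes, idle_messages):
    """Restart every node once, oldest first, for a single idle queue.

    Returns one (restarted_node, new_master, messages_on_master) tuple
    per restart. All rules here are simplifying assumptions.
    """
    order = list(nodes)                             # oldest first; order[0] is master
    held = {name: idle_messages for name in order}  # messages held per node
    history = []
    for name in list(nodes):
        order.remove(name)     # node goes down for maintenance
        held[name] = 0         # assumed: it comes back as an empty mirror
        order.append(name)     # and rejoins as the youngest member
        master = order[0]      # oldest node still standing is promoted
        # The queue is idle, so nothing triggers a resync of the empty mirror.
        history.append((name, master, held[master]))
    return history


for step in rolling_restart(["a", "b", "c"], 100):
    print(step)
```

In this model the first two restarts keep the data, because the promoted 
node still holds a full copy.  The third restart promotes a node that came 
back empty, and the messages are gone.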

It appears server maintenance can't simply be scheduled but is at the mercy 
of the cluster's queues.  If there are idle queues, is it best not to 
perform maintenance when the number of synchronized mirror nodes is down to 
a low "watermark"?  Or should server maintenance be staggered, so that some 
servers are updated frequently and others less often, to balance out the 
cluster?
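
The "watermark" idea could be sketched as a pre-maintenance check like the 
one below.  This is only an illustration: the function name and the 
default threshold are hypothetical, and the dictionary keys (name, node, 
messages, synchronised_slave_nodes) are modelled on what the management 
plugin's queue listing reports, so treat them as assumptions to verify 
against the version in use.  The check works on plain dictionaries and 
does not talk to a broker.

```python
def queues_at_risk(node, queues, low_watermark=1):
    """List queues that would fall below the watermark if `node` stopped.

    `queues` is a list of dicts describing mirrored queues; the key
    names are assumptions, see the note above.
    """
    at_risk = []
    for q in queues:
        if q.get("messages", 0) == 0:
            continue  # an empty queue has nothing to lose
        # Nodes holding a full copy: the master plus synchronized mirrors.
        full_copies = {q["node"]} | set(q.get("synchronised_slave_nodes", []))
        remaining = full_copies - {node}
        if len(remaining) < low_watermark:
            at_risk.append(q["name"])
    return at_risk


queues = [
    {"name": "idle", "node": "rabbit@c", "messages": 100,
     "synchronised_slave_nodes": []},
    {"name": "busy", "node": "rabbit@c", "messages": 5,
     "synchronised_slave_nodes": ["rabbit@a", "rabbit@b"]},
]
print(queues_at_risk("rabbit@c", queues))
```

With the default watermark of one full copy, only the idle queue is 
reported for rabbit@c, since stopping that node would leave no synchronized 
copy of its data anywhere in the cluster.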

What is the best way to handle cluster server maintenance?  Is there a way 
to manage an idle queue in a cluster, possibly by shuffling it between 
servers?

-Mark




