[rabbitmq-discuss] Queue data recovery after master failure

Fri Jun 20 14:20:21 BST 2014

On 18/06/14 09:42, Simon MacMullen wrote:
> I'm afraid not. Really "rabbitmqctl forget_cluster_node" should be able
> to cause down slaves to come back as new masters, which would be the
> right solution to this. I'm hoping that we'll be able to do that for
> 3.4.0, but it's a somewhat intrusive change. The bug number for this
> branch will be 26191, so you can keep an eye on it in future (currently
> there's nothing there).

To follow on:

This turns out to be easy if we can assume that there will be a slave to 
promote that is also down (quite likely since you tend to encounter this 
while your cluster is down), so I've done that as a first pass at the 
problem. See:

http://next.rabbitmq.com/ha.html#promotion-while-down

for documentation of how this will work in tonight's nightly build.
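
As an illustration only, the first-pass behaviour described above can be
modelled as a toy in Python. This is not RabbitMQ's implementation; the
MirroredQueue class, the forget_node function and the node names are all
hypothetical, and promoting the first listed down slave is an assumption
made for the sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class MirroredQueue:
    """Hypothetical record of where a mirrored queue lives."""
    name: str
    master: Optional[str]
    slaves: List[str] = field(default_factory=list)


def forget_node(queues: List[MirroredQueue], forgotten: str,
                running: Set[str]) -> List[MirroredQueue]:
    """Drop `forgotten` from every queue; promote a down slave if it was master."""
    for q in queues:
        # The forgotten node can no longer act as a slave for anything.
        q.slaves = [s for s in q.slaves if s != forgotten]
        if q.master != forgotten:
            continue
        # First pass: only slaves that are themselves down are candidates.
        candidates = [s for s in q.slaves if s not in running]
        if candidates:
            # Assumption: take the first down slave in list order.
            q.master = candidates[0]
            q.slaves.remove(q.master)
        else:
            # No down slave to promote: not handled by this sketch.
            q.master = None
    return queues


if __name__ == "__main__":
    qs = [MirroredQueue("orders", "rabbit@a", ["rabbit@b", "rabbit@c"])]
    forget_node(qs, "rabbit@a", running=set())
    print(qs[0].master, qs[0].slaves)
```

With the whole cluster down (an empty running set), forgetting rabbit@a
leaves rabbit@b as master and rabbit@c as the remaining slave. If the only
slave is already running again, the sketch leaves the queue without a
master, which stands in for the case that is not covered by the first pass.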

26191 is still reserved for the thornier case of how to do this when
the rest of the cluster has come back up. There are a lot of issues
there, so it might not happen soon.

Cheers, Simon

-- 
Simon MacMullen
RabbitMQ, Pivotal