[rabbitmq-discuss] autoheal behavior in the presence of HA-queues mastered on multiple nodes?

Matt Pietrek mpietrek at skytap.com
Mon Feb 24 16:26:04 GMT 2014


Thanks Simon. A bit after I sent my messages I came to the same conclusion.
It makes me very happy that you confirmed my reasoning.

Matt



On Mon, Feb 24, 2014 at 2:57 AM, Simon MacMullen <simon at rabbitmq.com> wrote:

> On 21/02/2014 11:49PM, Matt Pietrek wrote:
>
>> Picture a two node cluster with nodes A and B, using HA-queues and
>> autoheal. Some queues are mastered on A, and others on B. There's a VIP
>> in front of the cluster that points to one of the brokers.
>>
>
> OK. Looking at how things evolve step by step:
>
>
>> Now imagine a network event occurs and the cluster splits.
>>
>
> At this point the network is partitioned: both sides of the cluster are
> running separately, each believing that the other has gone down. So on A,
> all the queues that had masters on B fail over to A, and vice versa on B.
>
> So this is the nub of why network partitions are a big deal; the two sides
> of the cluster both think they are authoritative and both start to evolve
> separately.
>
>> Autoheal kicks in,
>>
>
> Of course autoheal will not kick in until the underlying network partition
> is resolved; until the two sides can see each other they will not have a
> clue anything has gone wrong (well, more wrong than node failure).
>
>
>> selects a loser (say, B), and restarts it to rejoin the
>> cluster. What happens to the data in queues that were mastered on B when
>> B restarts?
>>
>
> Hopefully at this point you have your answer: since A is the winning side,
> the queue state from A overwrites the queue state from B. And the queue
> state from A will be whatever the state was at the time of the split (even
> for queues mastered on B), with whatever changes A has seen since.
>
> Now, since you said earlier "there is a VIP pointing at A" then maybe no
> changes have been taking place on B anyway. But if there were any changes
> on B, you lost them.
>
> Partitions are a big deal, and autoheal is not a panacea.
>
>
>> And does it matter if the messages in the queue were
>> persistent or not?
>>
>
> No.
>
> Cheers, Simon
>
> --
> Simon MacMullen
> RabbitMQ, Pivotal
>
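
For anyone reading this thread later, the sequence Simon describes above
(split, both sides evolve separately, the winner's state replaces the
loser's) can be sketched as a toy model. This is only an illustration in
plain Python, not RabbitMQ code: the function names, the dict-of-lists
queue representation, and the message labels are all made up for the
example, and it assumes every queue was fully mirrored on both nodes at
the moment of the split.

```python
from copy import deepcopy


def partition(cluster_state):
    """Toy split: each side keeps its own full copy of every mirrored queue."""
    return deepcopy(cluster_state), deepcopy(cluster_state)


def autoheal(winner_state, loser_state):
    """Toy autoheal: the loser restarts and takes the winner's state wholesale.

    Returns the healed state plus the messages that existed only on the loser.
    """
    lost = {
        queue: [m for m in msgs if m not in winner_state.get(queue, [])]
        for queue, msgs in loser_state.items()
    }
    healed = deepcopy(winner_state)
    return healed, {queue: msgs for queue, msgs in lost.items() if msgs}


# State at the moment of the split: q1 was mastered on A, q2 on B.
before = {"q1": ["m1"], "q2": ["m2"]}
side_a, side_b = partition(before)

# During the partition each side accepts publishes independently.
side_a["q2"].append("a-only")
side_b["q2"].append("b-only")

# The partition ends; B is chosen as the loser.
healed, lost = autoheal(winner_state=side_a, loser_state=side_b)
print("healed:", healed)
print("lost:  ", lost)
```

The model carries no persistence flag on messages, in line with the
answer above that persistence makes no difference to what is lost.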