[rabbitmq-discuss] Queue disappears during partition/autoheal

Matt Pietrek mpietrek at skytap.com
Wed Apr 16 19:04:41 BST 2014


Thanks Simon. One more question:

> So this is something we've seen before in the case of short-lived
> partitions; something in Mnesia is sending a stray {mnesia_locker, ...,
> ...} message to a process that isn't expecting it after the partition,
> killing the process in question.

Do you have a sense of whether this behavior is specific to having Autoheal
enabled? In other words, could it still happen if Autoheal were not in
effect?
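
For reference, "Autoheal" here means the cluster_partition_handling setting
in the broker configuration. A minimal sketch of the relevant fragment,
assuming the classic Erlang-term rabbitmq.config format (the file location
and any other settings on a given install are not shown and will vary):

```erlang
%% rabbitmq.config (classic Erlang-term format; sketch only)
[
  {rabbit, [
    %% Accepted values: ignore | pause_minority | autoheal.
    %% ignore is the default when the key is absent.
    {cluster_partition_handling, autoheal}
  ]}
].
```

So "not having Autoheal in effect" would mean either removing this key or
setting it to ignore or pause_minority.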


On Wed, Apr 16, 2014 at 5:08 AM, Simon MacMullen <simon at rabbitmq.com> wrote:

> On 15/04/14 23:09, Matt Pietrek wrote:
>
>> This is rabbitmq 3.2.4, running in a 2 node cluster with all queues in ha.
>>
>> At some point we saw a network partition (see below). It appears that
>> Autoheal eventually worked, but afterwards the cmcmd queue wasn't on the
>> broker.
>>
>> =ERROR REPORT==== 14-Apr-2014::18:02:30 ===
>> ** Generic server <0.204.0> terminating
>> ** Last message in was {mnesia_locker,rabbit@sea5m1mq1,granted}
>> ** When Server state == {state,2,{from,<0.302.0>,#Ref<0.0.1372.163190>}}
>> ** Reason for termination ==
>> ** {unexpected_info,{mnesia_locker,rabbit@sea5m1mq1,granted}}
>
> So this is something we've seen before in the case of short-lived
> partitions; something in Mnesia is sending a stray {mnesia_locker, ...,
> ...} message to a process that isn't expecting it after the partition,
> killing the process in question.
>
> The release notes for Erlang 17.0 contain:
>
> OTP-11497  To prevent a race condition if there is a short communication
>            problem when node-down and node-up events are received. They
>            are now stored and later checked if the node came up just
>            before mnesia flagged the node as down. (Thanks to Jonas
>            Falkevik)
>
> which sounds like the same thing.
>
> So it is quite possible that this is fixed in Erlang 17.0.
>
> Cheers, Simon
>
> --
> Simon MacMullen
> RabbitMQ, Pivotal
>