[rabbitmq-discuss] AutoHeal not working after yanking network cable

Chris stuff at moesel.net
Thu Sep 5 20:23:14 BST 2013


Hi Simon,

Upgrading to 3.1.5 seems to have made things better.  So either one of the
other bug fixes in 3.1.2 - 3.1.5 helped, or I was just unlucky those couple
of times we were trying it with 3.1.1. ;-)

Unfortunately I don't have the logs from those previous failed attempts.

Thanks for the response,
Chris


On Fri, Aug 30, 2013 at 12:53 PM, Simon MacMullen <simon at rabbitmq.com> wrote:

> There was definitely a bug in autoheal fixed in 3.1.1, but I'm not aware
> of anything since then. However it's possible some other bug that we have
> fixed is causing your problems with autoheal.
>
> So:
>
> 1) You might as well try 3.1.5.
> 2) Are there any crashes in the logs on the minority node?
>
> Cheers, Simon
>
>
> On 30/08/2013 4:26PM, Chris wrote:
>
>> Hi All,
>>
>> As part of our testing of failovers, we yank the network cable on a
>> machine (to simulate a switch going down).  When we plug it back in,
>> RabbitMQ goes into the network partition mode.  At first we were using
>> the default ('ignore') option for dealing with partitions, but it caused
>> problems.
>>
>> After that we put the nodes into 'autoheal' mode.  This did not improve
>> things.  Not only did the minority node not rejoin the partition, but it
>> refused to restart without manually killing the process.  It also caused
>> problems on the other nodes (in the majority).  They stopped accepting
>> connections and I couldn't even log into the web UI.  So clearly,
>> 'autoheal' didn't seem to work as intended.
>>
>> We're using RabbitMQ 3.1.1.  Is there anything fixed since then that
>> might help with our situation?  Our end goal is to have everything
>> working again without intervention.  I understand that this could cause
>> *some* data loss during the autoheal process, but this is probably OK.
>> We'd love just to get all three nodes happy again without having to
>> manually restart any nodes.
>>
>> Thanks,
>> Chris
>>
> --
> Simon MacMullen
> RabbitMQ, Pivotal
>