[rabbitmq-discuss] AutoHeal not working after yanking network cable
Chris
stuff at moesel.net
Thu Sep 5 20:23:14 BST 2013
Hi Simon,
Upgrading to 3.1.5 seems to have made things better. So either one of the
other bug fixes in 3.1.2 - 3.1.5 helped, or I was just unlucky those couple
of times we were trying it with 3.1.1. ;-)
Unfortunately I don't have the logs from those previous failed attempts.
Thanks for the response,
Chris
On Fri, Aug 30, 2013 at 12:53 PM, Simon MacMullen <simon at rabbitmq.com> wrote:
> There was definitely a bug in autoheal fixed in 3.1.1, but I'm not aware
> of anything since then. However it's possible some other bug that we have
> fixed is causing your problems with autoheal.
>
> So:
>
> 1) You might as well try 3.1.5.
> 2) Are there any crashes in the logs on the minority node?
>
> Cheers, Simon
>
>
> On 30/08/2013 4:26PM, Chris wrote:
>
>> Hi All,
>>
>> As part of our testing of failovers, we yank the network cable on a
>> machine (to simulate a switch going down). When we plug it back in,
>> RabbitMQ goes into the network partition mode. At first we were using
>> the default ('ignore') option for dealing with partitions, but it caused
>> problems.
>>
>> After that we put the nodes into 'autoheal' mode. This did not improve
>> things. Not only did the minority node not rejoin the partition, but it
>> refused to restart without manually killing the process. It also caused
>> problems on the other nodes (in the majority). They stopped accepting
>> connections and I couldn't even log into the web UI. So clearly,
>> 'autoheal' didn't seem to work as intended.
>>
>> We're using RabbitMQ 3.1.1. Is there anything fixed since then that
>> might help with our situation? Our end goal is to have everything
>> working again without intervention. I understand that this could cause
>> *some* data loss during the autoheal process, but this is probably OK.
>> We'd love just to get all three nodes happy again without having to
>> manually restart any nodes.
>>
>> Thanks,
>> Chris
>>
>>
>> _______________________________________________
>> rabbitmq-discuss mailing list
>> rabbitmq-discuss at lists.rabbitmq.com
>> https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss
>>
>>
> --
> Simon MacMullen
> RabbitMQ, Pivotal
>
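
For reference, the partition handling modes named in this thread ('ignore' and 'autoheal') are selected with the cluster_partition_handling key of the rabbit application. A minimal sketch, assuming the Erlang-terms rabbitmq.config file used by the 3.1.x series; the file's location depends on the installation and is not given in the thread:

```erlang
%% rabbitmq.config (sketch only; merge into any existing rabbit section)
[
  {rabbit, [
    %% 'ignore' is the default; 'autoheal' is the mode tried in this thread
    {cluster_partition_handling, autoheal}
  ]}
].
```

The file is read at node startup, so a node has to be restarted to pick up the change. Running `rabbitmqctl cluster_status` on a node should list any partitions it currently sees.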