[rabbitmq-discuss] Problems with the {cluster_partition_handling, pause_minority} option

Simon MacMullen simon at rabbitmq.com
Tue Mar 18 09:54:27 GMT 2014


Hi. Just to let you know that this has not been forgotten about. I have 
reproduced the problem you describe and am investigating it. The key 
factor seems to be (at a TCP level) that A realises it is disconnected 
from B but not vice versa. Therefore (at an Erlang level) the partition 
is not detected.
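
If you want to force that asymmetry deliberately rather than relying on
timing, dropping traffic in one direction only should reproduce it. A
rough sketch, run on rabbitmq-test03 with your host names:

# Drop only *incoming* packets from the peers: test03 stops hearing from
# them and times them out, while they keep hearing from test03.
iptables -A INPUT -s rabbitmq-test01 -j DROP
iptables -A INPUT -s rabbitmq-test02 -j DROP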

It's not specifically to do with pause_minority mode.

I'll post more information when I have it.

Cheers, Simon

On 05/03/2014 17:18, David Rodrigues wrote:
> Dear RabbitMQ Community,
>
> I'm having some problems with the {cluster_partition_handling,
> pause_minority} option and I would like to share my findings and
> questions with you.
>
> First, the architecture. I'm running RabbitMQ 3.2.4 (Erlang R14A) on 3
> nodes (rabbitmq@rabbitmq-test01, rabbitmq@rabbitmq-test02 and
> rabbitmq@rabbitmq-test03) on a virtualized platform quite similar to
> EC2. Because of that I have connectivity issues from time to time -
> I know, that's bad ;)
>
> Digging into the documentation, I found that the best way to handle the
> problem is to use the pause_minority option -
> https://www.rabbitmq.com/partitions.html.
>
> But from time to time my nodes get disconnected from each other and do
> not recover automatically. Fortunately, I have managed to reproduce the
> problem. Here are the steps.
>
> THE CONFIGURATION FILE
> ***********************************************************
>
> My configuration file is quite simple:
>
> %% -*- mode: erlang -*-
> [
>   {rabbit,
>    [
>     {auth_mechanisms, ['PLAIN', 'AMQPLAIN']},
>     {default_vhost,       <<"/">>},
>     {default_user,        <<"admin">>},
>     {default_pass,        <<"admin">>},
>     {default_permissions, [<<".*">>, <<".*">>, <<".*">>]},
>     {default_user_tags, [administrator]},
>     {cluster_partition_handling, pause_minority},
>     {cluster_nodes, {['rabbitmq@rabbitmq-test01',
>                       'rabbitmq@rabbitmq-test02',
>                       'rabbitmq@rabbitmq-test03'], disc}}
>    ]},
>   {kernel, []},
>   {rabbitmq_management, []},
>   {rabbitmq_management_agent, []},
>   {rabbitmq_shovel,
>    [{shovels, []}
>    ]},
>   {rabbitmq_stomp, []},
>   {rabbitmq_mqtt, []},
>   {rabbitmq_amqp1_0, []},
>   {rabbitmq_auth_backend_ldap, []}
> ].
>
> As you can see, the {cluster_partition_handling, pause_minority} option
> is there.
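>
> To double-check that the setting is really picked up at runtime, I
> believe you can ask the running node directly:
>
> rabbitmqctl eval 'application:get_env(rabbit, cluster_partition_handling).'
>
> which should print {ok,pause_minority}.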
>
> PAUSE_MINORITY WORKING
> ***********************************************************
>
> When the network outage is long enough, the option works perfectly.
>
> To simulate a connection problem on rabbitmq-test03 I run:
>
> iptables -A INPUT -s rabbitmq-test01 -j DROP
> iptables -A OUTPUT -d rabbitmq-test01 -j DROP
> iptables -A INPUT -s rabbitmq-test02 -j DROP
> iptables -A OUTPUT -d rabbitmq-test02 -j DROP
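>
> To watch for the messages below while waiting, the node's log can be
> followed (assuming a default package install, it should live under
> /var/log/rabbitmq for this node name):
>
> tail -f /var/log/rabbitmq/rabbitmq@rabbitmq-test03.log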
>
> Then wait long enough for the following messages to appear in the logs
> of rabbitmq-test03 (approximately 180 seconds):
>
> =ERROR REPORT==== 5-Mar-2014::16:51:02 ===
> ** Node 'rabbitmq@rabbitmq-test02' not responding **
> ** Removing (timedout) connection **
> =ERROR REPORT==== 5-Mar-2014::16:51:02 ===
> ** Node 'rabbitmq@rabbitmq-test01' not responding **
> ** Removing (timedout) connection **
> =INFO REPORT==== 5-Mar-2014::16:51:02 ===
> rabbit on node 'rabbitmq@rabbitmq-test02' down
> =WARNING REPORT==== 5-Mar-2014::16:51:30 ===
> Cluster minority status detected - awaiting recovery
> =INFO REPORT==== 5-Mar-2014::16:51:30 ===
> rabbit on node 'rabbitmq@rabbitmq-test01' down
> =INFO REPORT==== 5-Mar-2014::16:51:30 ===
> Stopping RabbitMQ
> =INFO REPORT==== 5-Mar-2014::16:51:30 ===
> stopped TCP Listener on [::]:5672
> =WARNING REPORT==== 5-Mar-2014::16:51:58 ===
> Cluster minority status detected - awaiting recovery
>
> When flushing the rules (iptables -F) the connectivity is reestablished
> and the cluster works perfectly.
>
> In the logs:
>
> =INFO REPORT==== 5-Mar-2014::16:52:58 ===
> started TCP Listener on [::]:5672
> =INFO REPORT==== 5-Mar-2014::16:52:58 ===
> rabbit on node 'rabbitmq@rabbitmq-test01' up
> =INFO REPORT==== 5-Mar-2014::16:52:58 ===
> rabbit on node 'rabbitmq@rabbitmq-test02' up
>
> Finally, the cluster status (rabbitmqctl cluster_status):
>
> Cluster status of node 'rabbitmq@rabbitmq-test03' ...
> [{nodes,[{disc,['rabbitmq@rabbitmq-test01','rabbitmq@rabbitmq-test02',
>                  'rabbitmq@rabbitmq-test03']}]},
>   {running_nodes,['rabbitmq@rabbitmq-test01','rabbitmq@rabbitmq-test02',
>                   'rabbitmq@rabbitmq-test03']},
>   {partitions,[]}]
> ...done.
>
> So far, so good. The option works flawlessly.
>
> PAUSE_MINORITY NOT WORKING
> ***********************************************************
>
> Life is not so bright when the network partition is not long enough.
>
> On rabbitmq-test03 I will run my iptables commands again:
>
> iptables -A INPUT -s rabbitmq-test01 -j DROP
> iptables -A OUTPUT -d rabbitmq-test01 -j DROP
> iptables -A INPUT -s rabbitmq-test02 -j DROP
> iptables -A OUTPUT -d rabbitmq-test02 -j DROP
>
> However, this time I'll only wait 60 seconds before flushing my rules
> with iptables -F.
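>
> The whole short-outage test as one sequence (run as root on
> rabbitmq-test03):
>
> iptables -A INPUT -s rabbitmq-test01 -j DROP
> iptables -A OUTPUT -d rabbitmq-test01 -j DROP
> iptables -A INPUT -s rabbitmq-test02 -j DROP
> iptables -A OUTPUT -d rabbitmq-test02 -j DROP
> sleep 60
> iptables -F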
>
> And here is the result in the rabbitmq-test03 logs:
>
> =INFO REPORT==== 5-Mar-2014::16:55:00 ===
> rabbit on node 'rabbitmq@rabbitmq-test02' down
> =ERROR REPORT==== 5-Mar-2014::16:55:00 ===
> Mnesia('rabbitmq@rabbitmq-test03'): ** ERROR ** mnesia_event got
> {inconsistent_database, running_partitioned_network,
> 'rabbitmq@rabbitmq-test02'}
> =INFO REPORT==== 5-Mar-2014::16:55:00 ===
> rabbit on node 'rabbitmq@rabbitmq-test01' down
> =INFO REPORT==== 5-Mar-2014::16:55:00 ===
> Statistics database started.
> =INFO REPORT==== 5-Mar-2014::16:55:00 ===
> Statistics database started.
>
> Again, the result is quite ugly on rabbitmq-test01:
>
> =ERROR REPORT==== 5-Mar-2014::16:55:00 ===
> ** Node 'rabbitmq@rabbitmq-test03' not responding **
> ** Removing (timedout) connection **
> =INFO REPORT==== 5-Mar-2014::16:55:00 ===
> rabbit on node 'rabbitmq@rabbitmq-test03' down
> =INFO REPORT==== 5-Mar-2014::16:55:01 ===
> global: Name conflict terminating {rabbit_mgmt_db,<2669.1582.0>}
>
> Finally, my cluster status:
>
> Cluster status of node 'rabbitmq@rabbitmq-test03' ...
> [{nodes,[{disc,['rabbitmq@rabbitmq-test01','rabbitmq@rabbitmq-test02',
>                  'rabbitmq@rabbitmq-test03']}]},
>   {running_nodes,['rabbitmq@rabbitmq-test03']},
>   {partitions,[{'rabbitmq@rabbitmq-test03',['rabbitmq@rabbitmq-test02']}]}]
> ...done.
>
> That's it. Even with the pause_minority option, my cluster ended up partitioned.
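>
> A crude way to spot this state from a script is to grep the status
> output:
>
> rabbitmqctl cluster_status | grep partitions
>
> Anything other than {partitions,[]} means the cluster is split like above.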
>
> SYNOPSIS
> ***********************************************************
>
> In short, if the network outage is long enough, everything goes
> according to plan and the cluster works perfectly once connectivity is
> re-established. However, if the network outage has an intermediate
> duration (not too short, not too long), the pause_minority option does
> not seem to work.
>
> Are you aware of this problem? Is there any solution to cope with this
> particular situation?
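>
> I assume the manual way out, for now, is to restart the application on
> the node that ended up on its own, something like this on
> rabbitmq-test03 (please correct me if a full node restart is needed):
>
> rabbitmqctl stop_app
> rabbitmqctl start_app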
>
> Thanks,
> David
>
>
> _______________________________________________
> rabbitmq-discuss mailing list
> rabbitmq-discuss at lists.rabbitmq.com
> https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss
>


