[rabbitmq-discuss] Network partitioning

Geoffrey Samper geoffrey.samper at gmail.com
Thu Mar 20 09:41:22 GMT 2014


Hi, 


We have a cluster with 3 nodes (Rabbit 1, Rabbit 3, Rabbit 4). Sometimes we 
have network partitioning problems. During the day everything runs fine, 
but at night, around 3 AM and 4 AM, we experience network partitions, most 
of the time between Rabbit 3 and Rabbit 1. What is strange is that the log 
of Rabbit 1 contains "Generic server <0.263.0> terminating". 

Can anyone help me with this issue? The logs are below.


Rabbit 04

===========

=INFO REPORT==== 20-Mar-2014::03:41:27 ===
Mirrored-queue (queue 'ha.PageDeletedConsumer' in vhost 'A105'): Master 
<rabbit at P04.1.264.0> saw deaths of mirrors <rabbit at P01.3.264.0>

=INFO REPORT==== 20-Mar-2014::03:41:27 ===
Mirrored-queue (queue 'ha.services_p14' in vhost 'A105'): Master 
<rabbit at P04.1.262.0> saw deaths of mirrors <rabbit at P01.3.262.0>

=INFO REPORT==== 20-Mar-2014::03:41:27 ===
Mirrored-queue (queue 'ha.MNT_grbhstmntp01' in vhost 'A105'): Master 
<rabbit at P04.1.292.0> saw deaths of mirrors <rabbit at P01.3.292.0>

=INFO REPORT==== 20-Mar-2014::03:41:27 ===
Mirrored-queue (queue 'ha.services_p05' in vhost 'A105'): Slave 
<rabbit at P04.1.24486.0> saw deaths of mirrors <rabbit at P01.3.30101.0>

=INFO REPORT==== 20-Mar-2014::03:41:27 ===
Mirrored-queue (queue 'ha.services_p09' in vhost 'A105'): Master 
<rabbit at P04.1.24669.0> saw deaths of mirrors <rabbit at P01.3.30387.0>

Rabbit 03
=====================
=INFO REPORT==== 20-Mar-2014::03:41:27 ===
rabbit on node rabbit at P01 down

=ERROR REPORT==== 20-Mar-2014::03:41:27 ===
Mnesia(rabbit at P03): ** ERROR ** mnesia_event got {inconsistent_database, 
running_partitioned_network, rabbit at P01}

=INFO REPORT==== 20-Mar-2014::03:41:27 ===
Mirrored-queue (queue 'ha.services_p21' in vhost 'A105'): Slave 
<rabbit at P03.2.278.0> saw deaths of mirrors <rabbit at P01.3.275.0>

=INFO REPORT==== 20-Mar-2014::03:41:27 ===
Mirrored-queue (queue 'ha.services_p22' in vhost 'A105'): Slave 
<rabbit at P03.2.270.0> saw deaths of mirrors <rabbit at P01.3.268.0>

=INFO REPORT==== 20-Mar-2014::03:41:27 ===
Mirrored-queue (queue 'ha.PageDeletedConsumer' in vhost 'A105.001'): Slave 
<rabbit at P03.2.276.0> saw deaths of mirrors <rabbit at P01.3.274.0>

=INFO REPORT==== 20-Mar-2014::03:41:27 ===
Mirrored-queue (queue 'ha.services_p04' in vhost 'A105.001'): Slave 
<rabbit at P03.2.274.0> saw deaths of mirrors <rabbit at P01.3.271.0>

=INFO REPORT==== 20-Mar-2014::03:41:27 ===
Mirrored-queue (queue 'ha.MNT_grbhstmntp02' in vhost 'A105'): Slave 
<rabbit at P03.2.282.0> saw deaths of mirrors <rabbit at P01.3.280.0>

=INFO REPORT==== 20-Mar-2014::03:41:27 ===
Mirrored-queue (queue 'ha.services_p03' in vhost 'A105.001'): Slave 
<rabbit at P03.2.280.0> saw deaths of mirrors <rabbit at P01.3.278.0>

=INFO REPORT==== 20-Mar-2014::03:41:27 ===
Mirrored-queue (queue 'ha.MNT_grbhstmntp01' in vhost 'A105.001'): Slave 
<rabbit at P03.2.286.0> saw deaths of mirrors <rabbit at P01.3.284.0>

=INFO REPORT==== 20-Mar-2014::03:41:27 ===
Mirrored-queue (queue 'ha.services_p20' in vhost 'A105.001'): Slave 
<rabbit at P03.2.272.0> saw deaths of mirrors <rabbit at P01.3.270.0>

=INFO REPORT==== 20-Mar-2014::03:41:27 ===
Mirrored-queue (queue 'ha.PageDeleteConsumer' in vhost 'A105.001'): Slave 
<rabbit at P03.2.284.0> saw deaths of mirrors <rabbit at P01.3.282.0>

Rabbit 01
===================================

=INFO REPORT==== 20-Mar-2014::03:41:14 ===
rabbit on node rabbit at P03 down

=ERROR REPORT==== 20-Mar-2014::03:41:15 ===
Mnesia(rabbit at P01): ** ERROR ** mnesia_event got {inconsistent_database, 
running_partitioned_network, rabbit at P03}


=ERROR REPORT==== 20-Mar-2014::03:41:15 ===
** Generic server <0.263.0> terminating
** Last message in was {'DOWN',#Ref<0.0.0.1692>,process,<4625.265.0>,
                               noconnection}
** When Server state == {state,
                            {31,<0.263.0>},
                            {{28,<4625.265.0>},#Ref<0.0.0.1692>},
                            {{25,<5158.263.0>},#Ref<0.0.0.1693>},
                            {resource,<<"A105">>,queue,
                                <<"ha.services_p14">>},
                            rabbit_mirror_queue_slave,
                            {32,
                             [{{25,<5158.263.0>},
                               {view_member,
                                   {25,<5158.263.0>},
                                   [],
                                   {31,<0.263.0>},
                                   {28,<4625.265.0>}}},
                              {{28,<4625.265.0>},
                               {view_member,
                                   {28,<4625.265.0>},
                                   [],
                                   {25,<5158.263.0>},
                                   {31,<0.263.0>}}},
                              {{31,<0.263.0>},
                               {view_member,
                                   {31,<0.263.0>},
                                   [],
                                   {28,<4625.265.0>},
                                   {25,<5158.263.0>}}}]},
                            0,
                            [{{25,<5158.263.0>},{member,{[],[]},6,6}},
                             {{28,<4625.265.0>},{member,{[],[]},1,1}},
                             {{31,<0.263.0>},{member,{[],[]},0,0}}],
                            [<0.262.0>],
                            {[],[]},
                            [],0,undefined,
                            #Fun<rabbit_misc.execute_mnesia_transaction.1>}
** Reason for termination ==
** {function_clause,[{orddict,fetch,
                              [{31,<0.263.0>},[]],
                              [{file,"orddict.erl"},{line,72}]},
                     {gm,check_neighbours,1,[]},
                     {gm,handle_info,2,[]},
                     {gen_server2,handle_msg,2,[]},
                     {proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,249}]}]}

=INFO REPORT==== 20-Mar-2014::03:41:15 ===
Mirrored-queue (queue 'ha.MNT_Pages_grbhstmntp01' in vhost 'A105'): Slave 
<rabbit at P01.3.289.0> saw deaths of mirrors <rabbit at P01.3.289.0>

=INFO REPORT==== 20-Mar-2014::03:41:15 ===
Mirrored-queue (queue 'ha.services_p17' in vhost 'A105'): Slave 
<rabbit at P01.3.266.0> saw deaths of mirrors <rabbit at P03.2.268.0>

=INFO REPORT==== 20-Mar-2014::03:41:15 ===
Mirrored-queue (queue 'ha.services_p06' in vhost 'A105'): Slave 
<rabbit at P01.3.30366.0> saw deaths of mirrors <rabbit at P03.2.21958.0>

=INFO REPORT==== 20-Mar-2014::03:41:15 ===
Mirrored-queue (queue 'ha.services_p06' in vhost 'A105'): Promoting slave 
<rabbit at P01.3.30366.0> to master

=INFO REPORT==== 20-Mar-2014::03:41:16 ===
Mirrored-queue (queue 'ha.ElasticSearchConsumer' in vhost 'A105.001'): 
Slave <rabbit at P01.3.286.0> saw deaths of mirrors <rabbit at P03.2.288.0> 
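
To check whether the partitions really cluster around 3 AM and 4 AM, the partition events could be pulled out of each node's log and counted per hour. The following is only a rough Python sketch, assuming the logs keep the report format shown above; the function names and the sample text are illustrative and not part of RabbitMQ.

```python
import re
from collections import Counter
from datetime import datetime

# Header line of a report, e.g. "=ERROR REPORT==== 20-Mar-2014::03:41:15 ==="
# The leading "=" is optional because it is sometimes lost when pasting.
HEADER = re.compile(r"=?(?:INFO|ERROR|WARNING) REPORT==== (\S+) ===")

# Text Mnesia logs when it detects a partition (taken from the logs above).
MARKER = "running_partitioned_network"


def partition_times(log_text):
    """Return the timestamp of each report that mentions a partition."""
    times = []
    current = None   # timestamp of the report currently being read
    counted = False  # True once the current report has been recorded
    for line in log_text.splitlines():
        match = HEADER.match(line.strip())
        if match:
            current = datetime.strptime(match.group(1), "%d-%b-%Y::%H:%M:%S")
            counted = False
        elif MARKER in line and current is not None and not counted:
            times.append(current)
            counted = True
    return times


def partitions_per_hour(log_text):
    """Count partition reports by hour of day (0-23)."""
    return Counter(t.hour for t in partition_times(log_text))


# Hypothetical sample built from two reports in the Rabbit 01 log.
sample = """\
=INFO REPORT==== 20-Mar-2014::03:41:14 ===
rabbit on node rabbit at P03 down

=ERROR REPORT==== 20-Mar-2014::03:41:15 ===
Mnesia(rabbit at P01): ** ERROR ** mnesia_event got {inconsistent_database,
running_partitioned_network, rabbit at P03}
"""

print(partitions_per_hour(sample))
```

Running this over several days of logs from all three nodes should show whether the events line up with a fixed time of night.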