[rabbitmq-discuss] Network partitioning

Amol Kedar ajkedar at gmail.com
Wed Apr 2 05:41:28 BST 2014


Were you using the federation plugin?
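
In case it helps to line up the partition events across the three nodes, here is a rough Python sketch (not something shipped with RabbitMQ) that pulls the Mnesia running_partitioned_network reports out of log text like the excerpts quoted below. The names find_partitions, HEADER and EVENT are made up for illustration. It assumes the "=ERROR REPORT==== <timestamp> ===" header format seen in these logs, and it accepts node names written either as rabbit@P01 or in the archive-mangled form "rabbit at P01".

```python
import re

# Report header as it appears in the quoted logs, e.g.
# "=ERROR REPORT==== 20-Mar-2014::03:41:27 ===" (assumed format).
HEADER = re.compile(r"=(?:INFO|ERROR|WARNING) REPORT==== (\S+) ===")

# Mnesia partition event. The message may wrap across lines, and node
# names may be written "rabbit@P01" or "rabbit at P01".
EVENT = re.compile(
    r"Mnesia\((\S+?(?:@| at )\S+?)\).*?running_partitioned_network,\s*"
    r"(\S+?(?:@| at )[^\s}]+)\}",
    re.DOTALL,
)


def find_partitions(log_text):
    """Return a list of (timestamp, reporting_node, remote_node) tuples."""
    # Strip email quote markers ("> ") so quoted logs can be pasted in directly.
    cleaned = re.sub(r"^> ?", "", log_text, flags=re.MULTILINE)
    # Splitting on a pattern with one capture group yields
    # [preamble, ts1, body1, ts2, body2, ...].
    parts = HEADER.split(cleaned)
    results = []
    for timestamp, body in zip(parts[1::2], parts[2::2]):
        match = EVENT.search(body)
        if match:
            results.append((timestamp, match.group(1), match.group(2)))
    return results
```

Running find_partitions over each node's log and comparing the timestamps should show which node noticed the partition first and which peer it blamed. It only reports what the logs already say, so it will not explain why the link dropped around 3 AM.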

On Thursday, March 20, 2014 5:41:22 AM UTC-4, Geoffrey Samper wrote:
>
> Hi, 
>
>
> We have a cluster with 3 nodes (Rabbit 1, Rabbit 3, Rabbit 4). Sometimes 
> we have network partitioning problems. During the day everything runs fine, 
> but at night, around 3 AM and 4 AM, we experience network partitioning, most 
> of the time between nodes 3 and 1. What is strange is that the log of 
> Rabbit 1 contains "Generic server <0.263.0> terminating". 
>
> Can anyone help me with this issue? The logs are below.
>
>
> Rabbit 04
> ===========
>
> =INFO REPORT==== 20-Mar-2014::03:41:27 ===
> Mirrored-queue (queue 'ha.PageDeletedConsumer' in vhost 'A105'): Master 
> <rabbit at P04.1.264.0> saw deaths of mirrors <rabbit at P01.3.264.0>
>
> =INFO REPORT==== 20-Mar-2014::03:41:27 ===
> Mirrored-queue (queue 'ha.services_p14' in vhost 'A105'): Master 
> <rabbit at P04.1.262.0> saw deaths of mirrors <rabbit at P01.3.262.0>
>
> =INFO REPORT==== 20-Mar-2014::03:41:27 ===
> Mirrored-queue (queue 'ha.MNT_grbhstmntp01' in vhost 'A105'): Master 
> <rabbit at P04.1.292.0> saw deaths of mirrors <rabbit at P01.3.292.0>
>
> =INFO REPORT==== 20-Mar-2014::03:41:27 ===
> Mirrored-queue (queue 'ha.services_p05' in vhost 'A105'): Slave 
> <rabbit at P04.1.24486.0> saw deaths of mirrors <rabbit at P01.3.30101.0>
>
> =INFO REPORT==== 20-Mar-2014::03:41:27 ===
> Mirrored-queue (queue 'ha.services_p09' in vhost 'A105'): Master 
> <rabbit at P04.1.24669.0> saw deaths of mirrors <rabbit at P01.3.30387.0>
>
> Rabbit 03
> =====================
> =INFO REPORT==== 20-Mar-2014::03:41:27 ===
> rabbit on node rabbit at P01 down
>
> =ERROR REPORT==== 20-Mar-2014::03:41:27 ===
> Mnesia(rabbit at P03): ** ERROR ** mnesia_event got {inconsistent_database, 
> running_partitioned_network, rabbit at P01}
>
> =INFO REPORT==== 20-Mar-2014::03:41:27 ===
> Mirrored-queue (queue 'ha.services_p21' in vhost 'A105'): Slave 
> <rabbit at P03.2.278.0> saw deaths of mirrors <rabbit at P01.3.275.0>
>
> =INFO REPORT==== 20-Mar-2014::03:41:27 ===
> Mirrored-queue (queue 'ha.services_p22' in vhost 'A105'): Slave 
> <rabbit at P03.2.270.0> saw deaths of mirrors <rabbit at P01.3.268.0>
>
> =INFO REPORT==== 20-Mar-2014::03:41:27 ===
> Mirrored-queue (queue 'ha.PageDeletedConsumer' in vhost 'A105.001'): Slave 
> <rabbit at P03.2.276.0> saw deaths of mirrors <rabbit at P01.3.274.0>
>
> =INFO REPORT==== 20-Mar-2014::03:41:27 ===
> Mirrored-queue (queue 'ha.services_p04' in vhost 'A105.001'): Slave 
> <rabbit at P03.2.274.0> saw deaths of mirrors <rabbit at P01.3.271.0>
>
> =INFO REPORT==== 20-Mar-2014::03:41:27 ===
> Mirrored-queue (queue 'ha.MNT_grbhstmntp02' in vhost 'A105'): Slave 
> <rabbit at P03.2.282.0> saw deaths of mirrors <rabbit at P01.3.280.0>
>
> =INFO REPORT==== 20-Mar-2014::03:41:27 ===
> Mirrored-queue (queue 'ha.services_p03' in vhost 'A105.001'): Slave 
> <rabbit at P03.2.280.0> saw deaths of mirrors <rabbit at P01.3.278.0>
>
> =INFO REPORT==== 20-Mar-2014::03:41:27 ===
> Mirrored-queue (queue 'ha.MNT_grbhstmntp01' in vhost 'A105.001'): Slave 
> <rabbit at P03.2.286.0> saw deaths of mirrors <rabbit at P01.3.284.0>
>
> =INFO REPORT==== 20-Mar-2014::03:41:27 ===
> Mirrored-queue (queue 'ha.services_p20' in vhost 'A105.001'): Slave 
> <rabbit at P03.2.272.0> saw deaths of mirrors <rabbit at P01.3.270.0>
>
> =INFO REPORT==== 20-Mar-2014::03:41:27 ===
> Mirrored-queue (queue 'ha.PageDeleteConsumer' in vhost 'A105.001'): Slave 
> <rabbit at P03.2.284.0> saw deaths of mirrors <rabbit at P01.3.282.0>
>
> Rabbit 01
> ===================================
>
> =INFO REPORT==== 20-Mar-2014::03:41:14 ===
> rabbit on node rabbit at P03 down
>
> =ERROR REPORT==== 20-Mar-2014::03:41:15 ===
> Mnesia(rabbit at P01): ** ERROR ** mnesia_event got {inconsistent_database, 
> running_partitioned_network, rabbit at P03}
>
>
> =ERROR REPORT==== 20-Mar-2014::03:41:15 ===
> ** Generic server <0.263.0> terminating
> ** Last message in was {'DOWN',#Ref<0.0.0.1692>,process,<4625.265.0>,
>                                noconnection}
> ** When Server state == {state,
>                             {31,<0.263.0>},
>                             {{28,<4625.265.0>},#Ref<0.0.0.1692>},
>                             {{25,<5158.263.0>},#Ref<0.0.0.1693>},
>                             {resource,<<"A105">>,queue,
>                                 <<"ha.services_p14">>},
>                             rabbit_mirror_queue_slave,
>                             {32,
>                              [{{25,<5158.263.0>},
>                                {view_member,
>                                    {25,<5158.263.0>},
>                                    [],
>                                    {31,<0.263.0>},
>                                    {28,<4625.265.0>}}},
>                               {{28,<4625.265.0>},
>                                {view_member,
>                                    {28,<4625.265.0>},
>                                    [],
>                                    {25,<5158.263.0>},
>                                    {31,<0.263.0>}}},
>                               {{31,<0.263.0>},
>                                {view_member,
>                                    {31,<0.263.0>},
>                                    [],
>                                    {28,<4625.265.0>},
>                                    {25,<5158.263.0>}}}]},
>                             0,
>                             [{{25,<5158.263.0>},{member,{[],[]},6,6}},
>                              {{28,<4625.265.0>},{member,{[],[]},1,1}},
>                              {{31,<0.263.0>},{member,{[],[]},0,0}}],
>                             [<0.262.0>],
>                             {[],[]},
>                             [],0,undefined,
>                             #Fun<rabbit_misc.execute_mnesia_transaction.1>}
> ** Reason for termination == 
> ** {function_clause,[{orddict,fetch,
>                               [{31,<0.263.0>},[]],
>                               [{file,"orddict.erl"},{line,72}]},
>                      {gm,check_neighbours,1,[]},
>                      {gm,handle_info,2,[]},
>                      {gen_server2,handle_msg,2,[]},
>                      {proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,249}]}]}
>
> =INFO REPORT==== 20-Mar-2014::03:41:15 ===
> Mirrored-queue (queue 'ha.MNT_Pages_grbhstmntp01' in vhost 'A105'): Slave 
> <rabbit at P01.3.289.0> saw deaths of mirrors <rabbit at P01.3.289.0>
>
> =INFO REPORT==== 20-Mar-2014::03:41:15 ===
> Mirrored-queue (queue 'ha.services_p17' in vhost 'A105'): Slave 
> <rabbit at P01.3.266.0> saw deaths of mirrors <rabbit at P03.2.268.0>
>
> =INFO REPORT==== 20-Mar-2014::03:41:15 ===
> Mirrored-queue (queue 'ha.services_p06' in vhost 'A105'): Slave 
> <rabbit at P01.3.30366.0> saw deaths of mirrors <rabbit at P03.2.21958.0>
>
> =INFO REPORT==== 20-Mar-2014::03:41:15 ===
> Mirrored-queue (queue 'ha.services_p06' in vhost 'A105'): Promoting slave 
> <rabbit at P01.3.30366.0> to master
>
> =INFO REPORT==== 20-Mar-2014::03:41:16 ===
> Mirrored-queue (queue 'ha.ElasticSearchConsumer' in vhost 'A105.001'): 
> Slave <rabbit at P01.3.286.0> saw deaths of mirrors <rabbit at P03.2.288.0> 
>