<div dir="ltr">Hi all, <div><br></div><div>We are seeing occasional network partitions (net splits) on our cluster and we don't know why.</div><div>Before changing the net tick time parameter or the partition handling behavior, I want to understand why this is happening.</div>
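<div><br></div><div>For reference, this is a minimal sketch of where the two settings mentioned above would go in rabbitmq.config. The values are illustrative assumptions only, not our current configuration and not a recommendation: as far as I know, net_ticktime defaults to 60 seconds and cluster_partition_handling defaults to ignore.</div>
<div><pre>%% rabbitmq.config (sketch; values are assumptions for illustration)
[
  %% Erlang distribution tick time in seconds (default 60)
  {kernel, [{net_ticktime, 120}]},
  %% how RabbitMQ reacts to a partition: ignore (default), pause_minority or autoheal
  {rabbit, [{cluster_partition_handling, pause_minority}]}
].</pre></div><div><br></div>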
<div>Our environment is:</div><div>RabbitMQ 3.2.1, Erlang R16B</div><div>3 RabbitMQ nodes in the same subnet</div><div>RabbitMQ is installed on Windows 2008 R2 (VMware ESXi 5.1)</div><div>We have 4 mirrored queues on this cluster.</div>
<div>In production, the normal load is about 20 messages/second.</div><div><br></div><div>We observe that the split always occurs at the end of the VM snapshot (NetBackup).</div><div>However, we take a snapshot every night, and the network split occurs only about once every 15 to 20 days.</div>
<div><br></div><div><b><u>Log from server rabbit@FRA-VSP-32545:</u></b></div><div><div>=INFO REPORT==== 19-Feb-2014::18:34:47 ===</div><div>rabbit on node 'rabbit@FRA-VSP-32596' down</div><div><br></div><div>=INFO REPORT==== 19-Feb-2014::18:34:49 ===</div>
<div>Mirrored-queue (queue 'conso.queue.dead' in vhost '/IEC'): Slave <'rabbit@FRA-VSP-32545'.2.269.0> saw deaths of mirrors <'rabbit@FRA-VSP-32596'.1.270.0> </div></div><div><br>
</div><div><br></div><div><b><u>Log from server rabbit@FRA-VSP-32596:</u></b></div><div><div><br></div><div>=INFO REPORT==== 19-Feb-2014::18:34:28 ===</div><div>rabbit on node 'rabbit@FRA-VSP-32545' down</div><div><br>
</div><div>=ERROR REPORT==== 19-Feb-2014::18:34:30 ===</div><div>** Generic server <0.279.0> terminating</div><div>** Last message in was {'DOWN',#Ref<0.0.0.248452>,process,<5383.278.0>,</div><div>
noconnection}</div><div>** When Server state == {state,</div><div> {76,<0.279.0>},</div><div> {{79,<5383.278.0>},#Ref<0.0.0.248452>},</div>
<div> {{82,<5066.278.0>},#Ref<0.0.1.42330>},</div><div> {resource,<<"/IEC">>,queue,<<"conso.queue">>},</div><div>
rabbit_mirror_queue_coordinator,</div><div> {83,</div><div> [{{76,<0.279.0>},</div><div> {view_member,</div>
<div> {76,<0.279.0>},</div><div> [],</div><div> {79,<5383.278.0>},</div><div> {82,<5066.278.0>}}},</div>
<div> {{79,<5383.278.0>},</div><div> {view_member,</div><div> {79,<5383.278.0>},</div><div> [],</div>
<div> {82,<5066.278.0>},</div><div> {76,<0.279.0>}}},</div><div> {{82,<5066.278.0>},</div><div> {view_member,</div>
<div> {82,<5066.278.0>},</div><div> [],</div><div> {76,<0.279.0>},</div><div> {79,<5383.278.0>}}}]},</div>
<div> 1457518,</div><div> [{{76,<0.279.0>},{member,{[],[]},1457518,1457518}},</div><div> {{79,<5383.278.0>},{member,{[],[]},1,1}},</div>
<div> {{82,<5066.278.0>},{member,{[],[]},0,0}}],</div><div> [<0.1272.0>],</div><div> {[],[]},</div><div> [],undefined,</div>
<div> #Fun<rabbit_misc.execute_mnesia_transaction.1>}</div><div>** Reason for termination == </div><div>** {function_clause,</div><div> [{orddict,fetch,</div><div> [{76,<0.279.0>},</div>
<div> [{{82,<5066.278.0>},</div><div> {view_member,</div><div> {82,<5066.278.0>},</div><div> [{79,<5383.278.0>}],</div><div> {82,<5066.278.0>},</div>
<div> {82,<5066.278.0>}}}]],</div><div> [{file,"orddict.erl"},{line,72}]},</div><div> {gm,check_neighbours,1,[]},</div><div> {gm,handle_info,2,[]},</div><div> {gen_server2,handle_msg,2,[]},</div>
<div> {proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,249}]}]}</div><div><br></div><div>=ERROR REPORT==== 19-Feb-2014::18:34:30 ===</div><div>** Generic server <0.283.0> terminating</div><div>** Last message in was {'DOWN',#Ref<0.0.0.248454>,process,<5383.282.0>,</div>
<div> noconnection}</div><div>** When Server state == {state,</div><div> {67,<0.283.0>},</div><div> {{70,<5383.282.0>},#Ref<0.0.0.248454>},</div>
<div> {{73,<5066.282.0>},#Ref<0.0.1.42352>},</div><div> {resource,<<"/IEC">>,queue,<<"event.queue">>},</div><div>
rabbit_mirror_queue_coordinator,</div><div> {74,</div><div> [{{67,<0.283.0>},</div><div> {view_member,</div>
<div> {67,<0.283.0>},</div><div> [],</div><div> {70,<5383.282.0>},</div><div> {73,<5066.282.0>}}},</div>
<div> {{70,<5383.282.0>},</div><div> {view_member,</div><div> {70,<5383.282.0>},</div><div> [],</div>
<div> {73,<5066.282.0>},</div><div> {67,<0.283.0>}}},</div><div> {{73,<5066.282.0>},</div><div> {view_member,</div>
<div> {73,<5066.282.0>},</div><div> [],</div><div> {67,<0.283.0>},</div><div> {70,<5383.282.0>}}}]},</div>
<div> 212075,</div><div> [{{67,<0.283.0>},{member,{[],[]},212075,212075}},</div><div> {{70,<5383.282.0>},{member,{[],[]},1,1}},</div>
<div> {{73,<5066.282.0>},{member,{[],[]},0,0}}],</div><div> [<0.1271.0>],</div><div> {[],[]},</div><div> [],undefined,</div>
<div> #Fun<rabbit_misc.execute_mnesia_transaction.1>}</div><div>** Reason for termination == </div><div>** {function_clause,</div><div> [{orddict,fetch,</div><div> [{67,<0.283.0>},</div>
<div> [{{73,<5066.282.0>},</div><div> {view_member,</div><div> {73,<5066.282.0>},</div><div> [{70,<5383.282.0>}],</div><div> {73,<5066.282.0>},</div>
<div> {73,<5066.282.0>}}}]],</div><div> [{file,"orddict.erl"},{line,72}]},</div><div> {gm,check_neighbours,1,[]},</div><div> {gm,handle_info,2,[]},</div><div> {gen_server2,handle_msg,2,[]},</div>
<div> {proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,249}]}]}</div><div><br></div><div>=ERROR REPORT==== 19-Feb-2014::18:34:30 ===</div><div>** Generic server <0.203.0> terminating</div><div>** Last message in was {mnesia_tm,'rabbit@FRA-VSP-32545',</div>
<div> {vote_yes,{tid,10316,<0.203.0>}}}</div><div>** When Server state == 1</div><div>** Reason for termination == </div><div>** {unexpected_info,{mnesia_tm,'rabbit@FRA-VSP-32545',</div>
<div> {vote_yes,{tid,10316,<0.203.0>}}}}</div><div><br></div><div>=ERROR REPORT==== 19-Feb-2014::18:34:30 ===</div><div>** Generic server <0.275.0> terminating</div><div>** Last message in was {'DOWN',#Ref<0.0.1.38240>,process,<5383.274.0>,</div>
<div> noconnection}</div><div>** When Server state == {state,</div><div> {70,<0.275.0>},</div><div> {{76,<5066.274.0>},#Ref<0.0.1.42305>},</div>
<div> {{73,<5383.274.0>},#Ref<0.0.1.38240>},</div><div> {resource,<<"/IEC">>,queue,</div><div> <<"activity.queue.dead">>},</div>
<div> rabbit_mirror_queue_coordinator,</div><div> {77,</div><div> [{{70,<0.275.0>},</div><div> {view_member,</div>
<div> {70,<0.275.0>},</div><div> [],</div><div> {76,<5066.274.0>},</div><div> {73,<5383.274.0>}}},</div>
<div> {{73,<5383.274.0>},</div><div> {view_member,</div><div> {73,<5383.274.0>},</div><div> [],</div>
<div> {70,<0.275.0>},</div><div> {76,<5066.274.0>}}},</div><div> {{76,<5066.274.0>},</div><div> {view_member,</div>
<div> {76,<5066.274.0>},</div><div> [],</div><div> {73,<5383.274.0>},</div><div> {70,<0.275.0>}}}]},</div>
<div> 6,</div><div> [{{70,<0.275.0>},{member,{[],[]},6,6}},</div><div> {{73,<5383.274.0>},{member,{[],[]},1,1}},</div><div> {{76,<5066.274.0>},{member,{[],[]},0,0}}],</div>
<div> [<0.1273.0>],</div><div> {[],[]},</div><div> [],undefined,</div><div> #Fun<rabbit_misc.execute_mnesia_transaction.1>}</div>
<div>** Reason for termination == </div><div>** {noproc,{gen_server2,call,</div><div> [<0.203.0>,</div><div> {submit,#Fun<rabbit_misc.6.116010224>},</div><div> infinity]}}</div>
<div><br></div><div>=ERROR REPORT==== 19-Feb-2014::18:34:30 ===</div><div>** Generic server <0.204.0> terminating</div><div>** Last message in was {mnesia_tm,'rabbit@FRA-VSP-32545',</div><div> {vote_yes,{tid,10315,<0.204.0>}}}</div>
<div>** When Server state == 2</div><div>** Reason for termination == </div><div>** {unexpected_info,{mnesia_tm,'rabbit@FRA-VSP-32545',</div><div> {vote_yes,{tid,10315,<0.204.0>}}}}</div>
<div><br></div><div>=ERROR REPORT==== 19-Feb-2014::18:34:30 ===</div><div>** Generic server <0.1268.0> terminating</div><div>** Last message in was {'$gen_cast',{gm_deaths,[<5066.266.0>,<0.267.0>]}}</div>
<div>** When Server state == {state,</div><div> {amqqueue,</div><div> {resource,<<"/IEC">>,queue,</div><div> <<"gps.queue.dead">>},</div>
<div> true,false,none,[],<0.266.0>,</div><div> [<5066.265.0>],</div><div> [<5066.265.0>],</div><div> [{vhost,<<"/IEC">>},</div>
<div> {name,<<"Queue HA">>},</div><div> {pattern,<<".queue">>},</div><div> {'apply-to',<<"queues">>},</div>
<div> {definition,</div><div> [{<<"ha-mode">>,<<"all">>},</div><div> {<<"ha-sync-mode">>,<<"automatic">>}]},</div>
<div> {priority,0}],</div><div> [{<5066.266.0>,<5066.265.0>},</div><div> {<5383.266.0>,<5383.265.0>}],</div>
<div> []},</div><div> <0.267.0>,</div><div> {state,</div><div> {dict,0,16,16,8,80,48,</div><div> {[],[],[],[],[],[],[],[],[],[],[],[],[],</div>
<div> [],[],[]},</div><div> {{[],[],[],[],[],[],[],[],[],[],[],[],[],</div><div> [],[],[]}}},</div><div> erlang},</div>
<div> #Fun<rabbit_mirror_queue_master.5.69128381>,</div><div> #Fun<rabbit_mirror_queue_master.6.50493311>}</div><div>** Reason for termination == </div><div>
** {{case_clause,{ok,<5066.265.0>,[]}},</div><div> [{rabbit_mirror_queue_coordinator,handle_cast,2,[]},</div><div> {gen_server2,handle_msg,2,[]},</div><div> {proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,249}]}]}</div>
<div><br></div><div>=ERROR REPORT==== 19-Feb-2014::18:34:31 ===</div><div>** Generic server <0.266.0> terminating</div><div>** Last message in was {'EXIT',<0.1268.0>,</div><div> {{case_clause,{ok,<5066.265.0>,[]}},</div>
<div> [{rabbit_mirror_queue_coordinator,handle_cast,2,</div><div> []},</div><div> {gen_server2,handle_msg,2,[]},</div><div> {proc_lib,wake_up,3,</div>
<div> [{file,"proc_lib.erl"},{line,249}]}]}}</div><div>** When Server state == {q,</div><div> {amqqueue,</div><div> {resource,<<"/IEC">>,queue,<<"gps.queue.dead">>},</div>
<div> true,false,none,[],<0.266.0>,</div><div> [<5383.265.0>,<5066.265.0>],</div><div> [<5066.265.0>,<5383.265.0>],</div>
<div> [{vhost,<<"/IEC">>},</div><div> {name,<<"Queue HA">>},</div><div> {pattern,<<".queue">>},</div>
<div> {'apply-to',<<"queues">>},</div><div> {definition,</div><div> [{<<"ha-mode">>,<<"all">>},</div>
<div> {<<"ha-sync-mode">>,<<"automatic">>}]},</div><div> {priority,0}],</div><div> [{<5066.266.0>,<5066.265.0>},</div>
<div> {<5383.266.0>,<5383.265.0>},</div><div> {<0.267.0>,<0.266.0>}],</div><div> []},</div><div> none,false,rabbit_mirror_queue_master,</div>
<div> {state,</div><div> {resource,<<"/IEC">>,queue,<<"gps.queue.dead">>},</div><div> <0.267.0>,<0.1268.0>,rabbit_variable_queue,</div>
<div> {vqstate,</div><div> {0,{[],[]}},</div><div> {0,{[],[]}},</div><div> {delta,undefined,0,undefined},</div><div> {0,{[],[]}},</div>
<div> {0,{[],[]}},</div><div> 0,</div><div> {0,nil},</div><div> {0,nil},</div><div> {qistate,</div>
<div> "d:/tools/RabbitMQ Server/data/db/rabbit@FRA-VSP-32596-mnesia/queues/6IXYXKMC8M51EEAXH5MKLR0Q4",</div><div> {{dict,0,16,16,8,80,48,</div><div> {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],</div>
<div> []},</div><div> {{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],</div><div> []}}},</div><div> []},</div>
<div> undefined,0,65536,</div><div> #Fun<rabbit_variable_queue.2.81334491>,</div><div> {0,nil}},</div><div> {{client_msstate,msg_store_persistent,</div>
<div> <<55,209,140,132,77,86,75,214,37,255,72,56,103,92,</div><div> 154,75>>,</div><div> {dict,0,16,16,8,80,48,</div><div>
{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],</div>
<div> []},</div><div> {{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],</div><div> []}}},</div><div> {state,340043,</div>
<div> "d:/tools/RabbitMQ Server/data/db/rabbit@FRA-VSP-32596-mnesia/msg_store_persistent"},</div><div> rabbit_msg_store_ets_index,</div><div> "d:/tools/RabbitMQ Server/data/db/rabbit@FRA-VSP-32596-mnesia/msg_store_persistent",</div>
<div> <0.255.0>,344140,335946,348237,352334},</div><div> {client_msstate,msg_store_transient,</div><div> <<148,176,200,245,252,25,203,27,190,186,25,104,</div>
<div> 217,230,131,35>>,</div><div> {dict,0,16,16,8,80,48,</div><div> {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],</div><div> []},</div>
<div> {{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],</div><div> []}}},</div><div> {state,319558,</div><div> "d:/tools/RabbitMQ Server/data/db/rabbit@FRA-VSP-32596-mnesia/msg_store_transient"},</div>
<div> rabbit_msg_store_ets_index,</div><div> "d:/tools/RabbitMQ Server/data/db/rabbit@FRA-VSP-32596-mnesia/msg_store_transient",</div><div> <0.250.0>,323655,315461,327752,331849}},</div>
<div> true,0,0,0,infinity,0,0,0,0,0,</div><div> {rates,</div><div> {{1392,831016,530070},0},</div><div> {{1392,831016,530070},0},</div>
<div> 0.0,0.0,</div><div> {1392,831128,748070}},</div><div> {0,nil},</div><div> {0,nil},</div><div> {0,nil},</div>
<div> {0,nil},</div><div> 0,0,</div><div> {rates,</div><div> {{1392,831016,530070},0},</div><div> {{1392,831016,530070},0},</div>
<div> 0.0,0.0,</div><div> {1392,831128,748070}}},</div><div> {dict,0,16,16,8,80,48,</div><div> {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},</div>
<div> {{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],</div><div> []}}},</div><div> [],</div><div> {set,0,16,16,8,80,48,</div>
<div> {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},</div><div> {{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],</div><div> []}}}},</div>
<div>
{queue,[],[],0},</div><div> undefined,undefined,undefined,undefined,</div><div> {state,fine,5000,undefined},</div><div> {0,nil},</div>
<div> undefined,undefined,undefined,</div><div> {state,</div><div> {dict,0,16,16,8,80,48,</div><div> {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},</div>
<div> {{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],</div><div> []}}},</div><div> delegate},</div><div> undefined,undefined,undefined,4,running}</div>
<div>** Reason for termination == </div><div>** {{case_clause,{ok,<5066.265.0>,[]}},</div><div> [{rabbit_mirror_queue_coordinator,handle_cast,2,[]},</div><div> {gen_server2,handle_msg,2,[]},</div><div> {proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,249}]}]}</div>
</div><div><br></div><div><br></div><div>Any ideas?</div><div><br></div><div>Best regards, <br clear="all"><div><div><span style="font-family:arial,sans-serif;font-size:13px;border-collapse:collapse"><strong><p style="margin:0px;line-height:12pt;display:inline!important">
<strong><span style="font-size:8pt;font-family:Verdana;color:rgb(95,95,95)">________________________________________________________________</span></strong></p></strong></span><div><div><span style="font-size:x-small"><b><font color="#666666">Michaël OULLION</font></b></span></div>
<div><font color="#666666" face="verdana, sans-serif" size="1"><i>Java Architect</i></font></div>
<div><span style="font-family:verdana,sans-serif"><font color="#FF0000"><b>ND Informatique</b></font></span></div><div><span style="font-family:verdana,sans-serif"><font color="#FF0000"><b></b></font></span><span style="color:rgb(102,102,102);font-family:verdana,sans-serif;font-size:x-small">Address (1208 route des Pierrelles B.P. 98 BEAUSEMBLANT - 26240 Beausemblant - FRANCE)</span></div>
<div><font face="verdana, sans-serif"><span style="font-size:x-small"><font color="#666666">Tel. +33 (0)4 75 23 68 07</font></span></font></div><div><font face="verdana, sans-serif"><span style="font-size:x-small"><font color="#666666">Visit our web site at <a href="http://www.norbert-dentressangle.com/" target="_blank">www.norbert-dentressangle.com</a></font></span></font></div>
</div></div><div><font face="verdana, sans-serif"><span style="font-size:x-small"><br></span></font></div></div>
</div></div>