[rabbitmq-discuss] RabbitMQ Cluster, split network & VMWare snapshot
Michael Oullion
michael.oullion at norbert-dentressangle.com
Thu Feb 20 11:12:03 GMT 2014
Hi all,
We are seeing occasional network partitions (net splits) on our cluster and
we don't know why.
Before changing the net tick time parameter and the partition handling
behavior, I want to understand why they are happening.
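As background on the net tick parameter: according to the Erlang kernel
documentation, with net_ticktime set to T seconds (default 60), a peer that
stops responding is considered down after somewhere between T - T/4 and
T + T/4 seconds. The sketch below only works through that arithmetic. The
function names and the pause lengths are hypothetical illustrations, not
measurements from this cluster.

```python
# Sketch: the window in which Erlang distribution declares a silent peer down.
# Assumption: documented net_ticktime behaviour (down after T - T/4 .. T + T/4).

def tick_window(net_ticktime):
    """Return (min, max) seconds of silence before a peer is declared down."""
    quarter = net_ticktime / 4
    return (net_ticktime - quarter, net_ticktime + quarter)


def may_trigger_split(pause_seconds, net_ticktime=60):
    """True if a pause this long reaches the lower bound of the window."""
    low, _high = tick_window(net_ticktime)
    return pause_seconds >= low


if __name__ == "__main__":
    for ticktime in (60, 120):
        low, high = tick_window(ticktime)
        print(f"net_ticktime={ticktime}s: down after {low:.0f}s to {high:.0f}s")
    # Hypothetical pause lengths in seconds, for illustration only.
    for pause in (10, 50, 80):
        print(f"pause={pause}s may trigger a split: {may_trigger_split(pause)}")
```

Raising net_ticktime widens the window, at the cost of slower detection of
nodes that are really down.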
Our environment is:
RabbitMQ 3.2.1, Erlang R16B
3 RabbitMQ nodes in the same subnet
RabbitMQ is installed on Windows 2008 R2 (VMware ESXi 5.1)
We have 4 mirrored queues on this cluster.
In production, the normal load is about 20 messages/second.
We observe that the split always occurs at the end of the VM snapshot
(NetBackup).
However, we take a snapshot every night, and the network split occurs only
about once every 15 to 20 days.
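To line up the split with the snapshot window, it can help to pull the
"rabbit on node ... down" reports out of each node's log and compare their
timestamps. The following is a minimal sketch, assuming the report header
format seen in the log excerpts in this message. The helper name
node_down_events and the two sample strings are illustrative only.

```python
import re
from datetime import datetime

# Report header as it appears in the RabbitMQ logs quoted in this message.
HEADER = re.compile(r"=\w+ REPORT==== (\d{1,2}-\w{3}-\d{4}::\d{2}:\d{2}:\d{2}) ===")
DOWN = re.compile(r"rabbit on node (.+?) down")


def node_down_events(log_text):
    """Return a list of (timestamp, node) for each 'node down' report."""
    events = []
    current = None
    for line in log_text.splitlines():
        header = HEADER.match(line.strip())
        if header:
            # %b expects English month abbreviations (C/English locale).
            current = datetime.strptime(header.group(1), "%d-%b-%Y::%H:%M:%S")
            continue
        down = DOWN.search(line)
        if down and current is not None:
            events.append((current, down.group(1).strip("'")))
    return events


# Sample lines copied from the two log excerpts in this message.
LOG_32545 = """=INFO REPORT==== 19-Feb-2014::18:34:47 ===
rabbit on node 'rabbit at FRA-VSP-32596' down
"""
LOG_32596 = """=INFO REPORT==== 19-Feb-2014::18:34:28 ===
rabbit on node 'rabbit at FRA-VSP-32545' down
"""

seen_by_32545 = node_down_events(LOG_32545)[0][0]
seen_by_32596 = node_down_events(LOG_32596)[0][0]
gap = abs((seen_by_32545 - seen_by_32596).total_seconds())

if __name__ == "__main__":
    print(f"gap between the two 'down' reports: {gap:.0f} s")
```

The gap between the two reports may reflect a clock offset between the VMs
as well as a real difference in detection time, so it is only a hint.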
*Log from server rabbit at FRA-VSP-32545:*
=INFO REPORT==== 19-Feb-2014::18:34:47 ===
rabbit on node 'rabbit at FRA-VSP-32596' down
=INFO REPORT==== 19-Feb-2014::18:34:49 ===
Mirrored-queue (queue 'conso.queue.dead' in vhost '/IEC'): Slave
<'rabbit at FRA-VSP-32545'.2.269.0> saw deaths of mirrors
<'rabbit at FRA-VSP-32596'.1.270.0>
*Log from server rabbit at FRA-VSP-32596:*
=INFO REPORT==== 19-Feb-2014::18:34:28 ===
rabbit on node 'rabbit at FRA-VSP-32545' down
=ERROR REPORT==== 19-Feb-2014::18:34:30 ===
** Generic server <0.279.0> terminating
** Last message in was {'DOWN',#Ref<0.0.0.248452>,process,<5383.278.0>,
noconnection}
** When Server state == {state,
{76,<0.279.0>},
{{79,<5383.278.0>},#Ref<0.0.0.248452>},
{{82,<5066.278.0>},#Ref<0.0.1.42330>},
{resource,<<"/IEC">>,queue,<<"conso.queue">>},
rabbit_mirror_queue_coordinator,
{83,
[{{76,<0.279.0>},
{view_member,
{76,<0.279.0>},
[],
{79,<5383.278.0>},
{82,<5066.278.0>}}},
{{79,<5383.278.0>},
{view_member,
{79,<5383.278.0>},
[],
{82,<5066.278.0>},
{76,<0.279.0>}}},
{{82,<5066.278.0>},
{view_member,
{82,<5066.278.0>},
[],
{76,<0.279.0>},
{79,<5383.278.0>}}}]},
1457518,
[{{76,<0.279.0>},{member,{[],[]},1457518,1457518}},
{{79,<5383.278.0>},{member,{[],[]},1,1}},
{{82,<5066.278.0>},{member,{[],[]},0,0}}],
[<0.1272.0>],
{[],[]},
[],undefined,
#Fun<rabbit_misc.execute_mnesia_transaction.1>}
** Reason for termination ==
** {function_clause,
[{orddict,fetch,
[{76,<0.279.0>},
[{{82,<5066.278.0>},
{view_member,
{82,<5066.278.0>},
[{79,<5383.278.0>}],
{82,<5066.278.0>},
{82,<5066.278.0>}}}]],
[{file,"orddict.erl"},{line,72}]},
{gm,check_neighbours,1,[]},
{gm,handle_info,2,[]},
{gen_server2,handle_msg,2,[]},
{proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,249}]}]}
=ERROR REPORT==== 19-Feb-2014::18:34:30 ===
** Generic server <0.283.0> terminating
** Last message in was {'DOWN',#Ref<0.0.0.248454>,process,<5383.282.0>,
noconnection}
** When Server state == {state,
{67,<0.283.0>},
{{70,<5383.282.0>},#Ref<0.0.0.248454>},
{{73,<5066.282.0>},#Ref<0.0.1.42352>},
{resource,<<"/IEC">>,queue,<<"event.queue">>},
rabbit_mirror_queue_coordinator,
{74,
[{{67,<0.283.0>},
{view_member,
{67,<0.283.0>},
[],
{70,<5383.282.0>},
{73,<5066.282.0>}}},
{{70,<5383.282.0>},
{view_member,
{70,<5383.282.0>},
[],
{73,<5066.282.0>},
{67,<0.283.0>}}},
{{73,<5066.282.0>},
{view_member,
{73,<5066.282.0>},
[],
{67,<0.283.0>},
{70,<5383.282.0>}}}]},
212075,
[{{67,<0.283.0>},{member,{[],[]},212075,212075}},
{{70,<5383.282.0>},{member,{[],[]},1,1}},
{{73,<5066.282.0>},{member,{[],[]},0,0}}],
[<0.1271.0>],
{[],[]},
[],undefined,
#Fun<rabbit_misc.execute_mnesia_transaction.1>}
** Reason for termination ==
** {function_clause,
[{orddict,fetch,
[{67,<0.283.0>},
[{{73,<5066.282.0>},
{view_member,
{73,<5066.282.0>},
[{70,<5383.282.0>}],
{73,<5066.282.0>},
{73,<5066.282.0>}}}]],
[{file,"orddict.erl"},{line,72}]},
{gm,check_neighbours,1,[]},
{gm,handle_info,2,[]},
{gen_server2,handle_msg,2,[]},
{proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,249}]}]}
=ERROR REPORT==== 19-Feb-2014::18:34:30 ===
** Generic server <0.203.0> terminating
** Last message in was {mnesia_tm,'rabbit at FRA-VSP-32545',
{vote_yes,{tid,10316,<0.203.0>}}}
** When Server state == 1
** Reason for termination ==
** {unexpected_info,{mnesia_tm,'rabbit at FRA-VSP-32545',
{vote_yes,{tid,10316,<0.203.0>}}}}
=ERROR REPORT==== 19-Feb-2014::18:34:30 ===
** Generic server <0.275.0> terminating
** Last message in was {'DOWN',#Ref<0.0.1.38240>,process,<5383.274.0>,
noconnection}
** When Server state == {state,
{70,<0.275.0>},
{{76,<5066.274.0>},#Ref<0.0.1.42305>},
{{73,<5383.274.0>},#Ref<0.0.1.38240>},
{resource,<<"/IEC">>,queue,
<<"activity.queue.dead">>},
rabbit_mirror_queue_coordinator,
{77,
[{{70,<0.275.0>},
{view_member,
{70,<0.275.0>},
[],
{76,<5066.274.0>},
{73,<5383.274.0>}}},
{{73,<5383.274.0>},
{view_member,
{73,<5383.274.0>},
[],
{70,<0.275.0>},
{76,<5066.274.0>}}},
{{76,<5066.274.0>},
{view_member,
{76,<5066.274.0>},
[],
{73,<5383.274.0>},
{70,<0.275.0>}}}]},
6,
[{{70,<0.275.0>},{member,{[],[]},6,6}},
{{73,<5383.274.0>},{member,{[],[]},1,1}},
{{76,<5066.274.0>},{member,{[],[]},0,0}}],
[<0.1273.0>],
{[],[]},
[],undefined,
#Fun<rabbit_misc.execute_mnesia_transaction.1>}
** Reason for termination ==
** {noproc,{gen_server2,call,
[<0.203.0>,
{submit,#Fun<rabbit_misc.6.116010224>},
infinity]}}
=ERROR REPORT==== 19-Feb-2014::18:34:30 ===
** Generic server <0.204.0> terminating
** Last message in was {mnesia_tm,'rabbit at FRA-VSP-32545',
{vote_yes,{tid,10315,<0.204.0>}}}
** When Server state == 2
** Reason for termination ==
** {unexpected_info,{mnesia_tm,'rabbit at FRA-VSP-32545',
{vote_yes,{tid,10315,<0.204.0>}}}}
=ERROR REPORT==== 19-Feb-2014::18:34:30 ===
** Generic server <0.1268.0> terminating
** Last message in was {'$gen_cast',{gm_deaths,[<5066.266.0>,<0.267.0>]}}
** When Server state == {state,
{amqqueue,
{resource,<<"/IEC">>,queue,
<<"gps.queue.dead">>},
true,false,none,[],<0.266.0>,
[<5066.265.0>],
[<5066.265.0>],
[{vhost,<<"/IEC">>},
{name,<<"Queue HA">>},
{pattern,<<".queue">>},
{'apply-to',<<"queues">>},
{definition,
[{<<"ha-mode">>,<<"all">>},
{<<"ha-sync-mode">>,<<"automatic">>}]},
{priority,0}],
[{<5066.266.0>,<5066.265.0>},
{<5383.266.0>,<5383.265.0>}],
[]},
<0.267.0>,
{state,
{dict,0,16,16,8,80,48,
{[],[],[],[],[],[],[],[],[],[],[],[],[],
[],[],[]},
{{[],[],[],[],[],[],[],[],[],[],[],[],[],
[],[],[]}}},
erlang},
#Fun<rabbit_mirror_queue_master.5.69128381>,
#Fun<rabbit_mirror_queue_master.6.50493311>}
** Reason for termination ==
** {{case_clause,{ok,<5066.265.0>,[]}},
[{rabbit_mirror_queue_coordinator,handle_cast,2,[]},
{gen_server2,handle_msg,2,[]},
{proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,249}]}]}
=ERROR REPORT==== 19-Feb-2014::18:34:31 ===
** Generic server <0.266.0> terminating
** Last message in was {'EXIT',<0.1268.0>,
{{case_clause,{ok,<5066.265.0>,[]}},
[{rabbit_mirror_queue_coordinator,handle_cast,2,
[]},
{gen_server2,handle_msg,2,[]},
{proc_lib,wake_up,3,
[{file,"proc_lib.erl"},{line,249}]}]}}
** When Server state == {q,
{amqqueue,
{resource,<<"/IEC">>,queue,<<"gps.queue.dead">>},
true,false,none,[],<0.266.0>,
[<5383.265.0>,<5066.265.0>],
[<5066.265.0>,<5383.265.0>],
[{vhost,<<"/IEC">>},
{name,<<"Queue HA">>},
{pattern,<<".queue">>},
{'apply-to',<<"queues">>},
{definition,
[{<<"ha-mode">>,<<"all">>},
{<<"ha-sync-mode">>,<<"automatic">>}]},
{priority,0}],
[{<5066.266.0>,<5066.265.0>},
{<5383.266.0>,<5383.265.0>},
{<0.267.0>,<0.266.0>}],
[]},
none,false,rabbit_mirror_queue_master,
{state,
{resource,<<"/IEC">>,queue,<<"gps.queue.dead">>},
<0.267.0>,<0.1268.0>,rabbit_variable_queue,
{vqstate,
{0,{[],[]}},
{0,{[],[]}},
{delta,undefined,0,undefined},
{0,{[],[]}},
{0,{[],[]}},
0,
{0,nil},
{0,nil},
{qistate,
"d:/tools/RabbitMQ
Server/data/db/rabbit at FRA-VSP-32596-mnesia
/queues/6IXYXKMC8M51EEAXH5MKLR0Q4",
{{dict,0,16,16,8,80,48,
{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],
[]},
{{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],
[]}}},
[]},
undefined,0,65536,
#Fun<rabbit_variable_queue.2.81334491>,
{0,nil}},
{{client_msstate,msg_store_persistent,
<<55,209,140,132,77,86,75,214,37,255,72,56,103,92,
154,75>>,
{dict,0,16,16,8,80,48,
{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],
[]},
{{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],
[]}}},
{state,340043,
"d:/tools/RabbitMQ
Server/data/db/rabbit at FRA-VSP-32596-mnesia/msg_store_persistent"},
rabbit_msg_store_ets_index,
"d:/tools/RabbitMQ
Server/data/db/rabbit at FRA-VSP-32596-mnesia/msg_store_persistent",
<0.255.0>,344140,335946,348237,352334},
{client_msstate,msg_store_transient,
<<148,176,200,245,252,25,203,27,190,186,25,104,
217,230,131,35>>,
{dict,0,16,16,8,80,48,
{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],
[]},
{{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],
[]}}},
{state,319558,
"d:/tools/RabbitMQ
Server/data/db/rabbit at FRA-VSP-32596-mnesia/msg_store_transient"},
rabbit_msg_store_ets_index,
"d:/tools/RabbitMQ
Server/data/db/rabbit at FRA-VSP-32596-mnesia/msg_store_transient",
<0.250.0>,323655,315461,327752,331849}},
true,0,0,0,infinity,0,0,0,0,0,
{rates,
{{1392,831016,530070},0},
{{1392,831016,530070},0},
0.0,0.0,
{1392,831128,748070}},
{0,nil},
{0,nil},
{0,nil},
{0,nil},
0,0,
{rates,
{{1392,831016,530070},0},
{{1392,831016,530070},0},
0.0,0.0,
{1392,831128,748070}}},
{dict,0,16,16,8,80,48,
{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},
{{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],
[]}}},
[],
{set,0,16,16,8,80,48,
{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},
{{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],
[]}}}},
{queue,[],[],0},
undefined,undefined,undefined,undefined,
{state,fine,5000,undefined},
{0,nil},
undefined,undefined,undefined,
{state,
{dict,0,16,16,8,80,48,
{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},
{{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],
[]}}},
delegate},
undefined,undefined,undefined,4,running}
** Reason for termination ==
** {{case_clause,{ok,<5066.265.0>,[]}},
[{rabbit_mirror_queue_coordinator,handle_cast,2,[]},
{gen_server2,handle_msg,2,[]},
{proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,249}]}]}
Any ideas?
Best regards,
* ________________________________________________________________*
*Michaël OULLION*
*Architecte JAVA*
*ND Informatique*
Adresse (1208 route des Pierrelles B.P. 98 BEAUSEMBLANT - 26240
Beausemblant - FRANCE)
Tel. +33 (0)4 75 23 68 07
Visit our web site at www.norbert-dentressangle.com