<html><head></head><body style="word-wrap: break-word; -webkit-nbsp-mode: space; -webkit-line-break: after-white-space; "><div>Hi Matt,</div><div><br></div><div>Just to keep you up to date - we're still investigating this. I will reply here once we've got to the bottom of it.</div><div><br></div><div>Cheers,</div><div>Tim</div><br><div><div>On 13 Sep 2012, at 21:05, Matt Pietrek wrote:</div><br class="Apple-interchange-newline"><blockquote type="cite">Here you to Tim:<br><br>The scrubbed <a href="mailto:rabbit@mq1.log">rabbit@mq1.log</a>:<br>--------<br>=INFO REPORT==== 13-Sep-2012::12:00:21 ===<br> application: rabbitmq_management<br> exited: stopped<br> type: permanent<br><br>=INFO REPORT==== 13-Sep-2012::12:00:21 ===<br>
application: rabbitmq_federation<br> exited: stopped<br> type: permanent<br><br>=INFO REPORT==== 13-Sep-2012::12:00:21 ===<br>Stopping Rabbit<br><br>=INFO REPORT==== 13-Sep-2012::12:00:21 ===<br> application: rabbitmq_management_agent<br>
exited: stopped<br> type: permanent<br><br>=WARNING REPORT==== 13-Sep-2012::12:00:21 ===<br>Connection (<0.1659.0>) closing: internal error in channel (<0.1666.0>): shutdown<br><br>=INFO REPORT==== 13-Sep-2012::12:00:21 ===<br>
stopped TCP Listener on <a href="http://0.0.0.0:5672/">0.0.0.0:5672</a><br><br>=ERROR REPORT==== 13-Sep-2012::12:17:00 ===<br>** Generic server <0.1676.0> terminating <br>** Last message in was {inet_async,#Port<0.5748>,5106,{error,etimedout}}<br>
** When Server state == {state,#Port<0.5748>,<0.1671.0>,<0.1673.0>,<br> {method,rabbit_framing_amqp_0_9_1},<br> none}<br>** Reason for termination == <br>
** {socket_error,etimedout}<br><br>=ERROR REPORT==== 13-Sep-2012::12:17:00 ===<br>** Generic server <0.1671.0> terminating <br>** Last message in was {socket_error,etimedout}<br>** When Server state == {state,amqp_network_connection,<br>
{state,#Port<0.5748>,0,<0.1675.0>,131072,<br> undefined,false},<br> <0.1670.0>,<0.1673.0>,<br> {amqp_params_network,<<"guest">>,<<"guest">>,<br>
<<"/">>,"<a href="http://slg1matt1.xyz.com/">slg1matt1.xyz.com</a>",<br> 5672,0,0,0,infinity,none,<br> [#Fun<amqp_auth_mechanisms.plain.3>,<br>
#Fun<amqp_auth_mechanisms.amqplain.3>],<br> [],[]},<br> 0,<br> [{<<"capabilities">>,table,<br>
[{<<"publisher_confirms">>,bool,true},<br> {<<"exchange_exchange_bindings">>,bool,true},<br> {<<"basic.nack">>,bool,true},<br>
{<<"consumer_cancel_notify">>,bool,true}]},<br> {<<"copyright">>,longstr,<br> <<"Copyright (C) 2007-2012 VMware, Inc.">>},<br>
{<<"information">>,longstr,<br> <<"Licensed under the MPL. See <a href="http://www.rabbitmq.com/">http://www.rabbitmq.com/</a>">>},<br>
{<<"platform">>,longstr,<<"Erlang/OTP">>},<br> {<<"product">>,longstr,<<"RabbitMQ">>},<br>
{<<"version">>,longstr,<<"2.8.6">>}],<br> #Fun<amqp_connection_sup.0.39273983>,<br> #Fun<amqp_connection_sup.2.54430129>,false}<br>
** Reason for termination == <br>** {socket_error,etimedout}<br><br>=ERROR REPORT==== 13-Sep-2012::12:17:00 ===<br>** Generic server <0.1657.0> terminating<br>** Last message in was {'EXIT',<0.197.0>,shutdown}<br>
** When Server state == {state,<br> {upstream,<br> {amqp_params_network,<<"guest">>,<<"guest">>,<br> <<"/">>,"<a href="http://slg1matt1.xyz.com/">slg1matt1.xyz.com</a>",5672,0,<br>
0,0,infinity,none,<br> [#Fun<amqp_auth_mechanisms.plain.3>,<br> #Fun<amqp_auth_mechanisms.amqplain.3>],<br> [],[]},<br>
<<"skytap">>,1000,1,1,none,none,none,"upstream-slg"},<br> <0.1671.0>,<0.1683.0>,<br> <<"federation: skytap -> <a href="mailto:rabbit@mq1.matt.xyz.com">rabbit@mq1.matt.xyz.com</a>">>,<br>
<<"federation: skytap -> <a href="mailto:rabbit@mq1.matt.xyz.com">rabbit@mq1.matt.xyz.com</a> A">>,<br> {0,nil},<br> 115,<br>
{dict,10,16,16,8,80,48,<br> {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],[]},<br> {{[],<br> [[{<<"charon.us-east.default">>,[]}|<br>
{set,1,16,16,8,80,48,<br> {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],<br> []},<br> {{[],[],[],[],[],[],[],[],[],[],[],[],<br>
[{resource,<<"/">>,exchange,<br> <<"federation: skytap -> <a href="mailto:rabbit@slg1matt1.xyz.com">rabbit@slg1matt1.xyz.com</a> A">>}],<br>
[],[],[]}}}]],<br> [],<br> [[{<<"cmcmd.us-east.default">>,[]}|<br> {set,1,16,16,8,80,48,<br>
{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],<br> []},<br> {{[],[],[],[],[],[],[],[],[],[],[],[],<br> [{resource,<<"/">>,exchange,<br>
<<"federation: skytap -> <a href="mailto:rabbit@slg1matt1.xyz.com">rabbit@slg1matt1.xyz.com</a> A">>}],<br> [],[],[]}}}],<br> [{<<"charon.us-west.default">>,[]}|<br>
{set,2,16,16,8,80,48,<br> {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],<br> []},<br> {{[],[],[],<br>
[{resource,<<"/">>,queue,<<"charon">>}],<br> [],[],[],[],[],[],[],[],<br> [{resource,<<"/">>,exchange,<br>
<<"federation: skytap -> <a href="mailto:rabbit@slg1matt1.xyz.com">rabbit@slg1matt1.xyz.com</a> A">>}],<br> [],[],[]}}}]],<br> [],<br>
[[{<<"cmcmd.us-west.default">>,[]}|<br> {set,2,16,16,8,80,48,<br> {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],<br>
[]},<br> {{[],[],[],[],[],<br> [{resource,<<"/">>,queue,<<"cmcmd">>}],<br> [],[],[],[],[],[],<br>
[{resource,<<"/">>,exchange,<br> <<"federation: skytap -> <a href="mailto:rabbit@slg1matt1.xyz.com">rabbit@slg1matt1.xyz.com</a> A">>}],<br>
[],[],[]}}}]],<br> [],<br> [[{<<"system_test.us-east.default">>,[]}|<br> {set,1,16,16,8,80,48,<br>
{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],<br> []},<br> {{[],[],[],[],[],[],[],[],[],[],[],[],<br> [{resource,<<"/">>,exchange,<br>
<<"federation: skytap -> <a href="mailto:rabbit@slg1matt1.xyz.com">rabbit@slg1matt1.xyz.com</a> A">>}],<br> [],[],[]}}}]],<br> [],<br>
[[{<<"system_test.us-west.default">>,[]}|<br> {set,2,16,16,8,80,48,<br> {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],<br>
[]},<br> {{[],<br> [{resource,<<"/">>,queue,<<"system_test">>}],<br> [],[],[],[],[],[],[],[],[],[],<br>
[{resource,<<"/">>,exchange,<br> <<"federation: skytap -> <a href="mailto:rabbit@slg1matt1.xyz.com">rabbit@slg1matt1.xyz.com</a> A">>}],<br>
[],[],[]}}}]],<br> [],[],<br> [[{<<"unit_test.us-east.default">>,[]}|<br> {set,1,16,16,8,80,48,<br>
{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],<br> []},<br> {{[],[],[],[],[],[],[],[],[],[],[],[],<br> [{resource,<<"/">>,exchange,<br>
<<"federation: skytap -> <a href="mailto:rabbit@slg1matt1.xyz.com">rabbit@slg1matt1.xyz.com</a> A">>}],<br> [],[],[]}}}]],<br> [[{<<"ConfigurationManager.us-east.default">>,<br>
[]}|<br> {set,1,16,16,8,80,48,<br> {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],<br> []},<br> {{[],[],[],[],[],[],[],[],[],[],[],[],<br>
[{resource,<<"/">>,exchange,<br> <<"federation: skytap -> <a href="mailto:rabbit@slg1matt1.xyz.com">rabbit@slg1matt1.xyz.com</a> A">>}],<br>
[],[],[]}}}]],<br> [[{<<"unit_test.us-west.default">>,[]}|<br> {set,2,16,16,8,80,48,<br> {[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],<br>
[]},<br> {{[],[],[],[],[],[],[],[],<br> [{resource,<<"/">>,queue,<<"unit_test">>}],<br>
[],[],[],<br> [{resource,<<"/">>,exchange,<br> <<"federation: skytap -> <a href="mailto:rabbit@slg1matt1.xyz.com">rabbit@slg1matt1.xyz.com</a> A">>}],<br>
[],[],[]}}}]],<br> [[{<<"ConfigurationManager.us-west.default">>,<br> []}|<br> {set,2,16,16,8,80,48,<br>
{[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],<br> []},<br> {{[],[],[],[],[],[],[],<br> [{resource,<<"/">>,queue,<br>
<<"ConfigurationManager">>}],<br> [],[],[],[],<br> [{resource,<<"/">>,exchange,<br>
<<"federation: skytap -> <a href="mailto:rabbit@slg1matt1.xyz.com">rabbit@slg1matt1.xyz.com</a> A">>}],<br> [],[],[]}}}]]}}},<br>
<0.1659.0>,<0.1666.0>,<br> {resource,<<"/">>,exchange,<<"skytap">>},<br> {0,nil}}<br>** Reason for termination == <br>
** {undef,<br> [{rabbit_federation_old_status,report,<br> [{upstream,<br> {amqp_params_network,<<"guest">>,<<"guest">>,<<"/">>,<br>
"<a href="http://slg1matt1.xyz.com/">slg1matt1.xyz.com</a>",5672,0,0,0,infinity,<br> none,<br> [#Fun<amqp_auth_mechanisms.plain.3>,<br> #Fun<amqp_auth_mechanisms.amqplain.3>],<br>
[],[]},<br> <<"skytap">>,1000,1,1,none,none,none,"upstream-slg"},<br> {resource,<<"/">>,exchange,<<"skytap">>},<br>
shutdown],<br> []},<br> {rabbit_federation_link,terminate,2,[]},<br> {gen_server2,terminate,3,[]},<br> {proc_lib,init_p_do_apply,3,[{file,"proc_lib.erl"},{line,227}]}]}<br>
** In 'terminate' callback with reason ==<br>** shutdown<br><br>=INFO REPORT==== 13-Sep-2012::12:17:00 ===<br> application: rabbit<br> exited: stopped<br> type: permanent<br><br>=INFO REPORT==== 13-Sep-2012::12:17:00 ===<br>
application: os_mon<br> exited: stopped<br> type: permanent<br><br>=INFO REPORT==== 13-Sep-2012::12:17:00 ===<br> application: mnesia<br> exited: stopped<br> type: permanent<br><br>=INFO REPORT==== 13-Sep-2012::12:17:00 ===<br>
Halting Erlang VM<br>--------<br><br>And the scrubbed <a href="mailto:rabbit@mq1-sasl.log">rabbit@mq1-sasl.log</a>:<br><br>--------<br>=SUPERVISOR REPORT==== 13-Sep-2012::12:00:21 ===<br> Supervisor: {<0.1664.0>,amqp_channel_sup}<br> Context: child_terminated<br>
Reason: shutdown<br> Offender: [{pid,<0.1666.0>},<br> {name,channel},<br> {mfa,{amqp_channel,start_link,<br> [direct,<0.1659.0>,1,<0.1665.0>,<br>
#Fun<amqp_channel_sup.0.24953263>]}},<br> {restart_type,intrinsic},<br> {shutdown,4294967295},<br> {child_type,worker}]<br><br>
<br>=SUPERVISOR REPORT==== 13-Sep-2012::12:00:21 ===<br> Supervisor: {<0.1664.0>,amqp_channel_sup}<br> Context: shutdown<br> Reason: reached_max_restart_intensity<br> Offender: [{pid,<0.1666.0>},<br>
{name,channel},<br> {mfa,{amqp_channel,start_link,<br> [direct,<0.1659.0>,1,<0.1665.0>,<br> #Fun<amqp_channel_sup.0.24953263>]}},<br>
{restart_type,intrinsic},<br> {shutdown,4294967295},<br> {child_type,worker}]<br><br><br>=CRASH REPORT==== 13-Sep-2012::12:17:00 ===<br> crasher:<br> initial call: amqp_main_reader:init/1<br>
pid: <0.1676.0><br> registered_name: []<br> exception exit: {socket_error,etimedout}<br> in function gen_server:terminate/6 (gen_server.erl, line 747)<br> ancestors: [<0.1674.0>,<0.1670.0>,amqp_sup,<0.240.0>]<br>
messages: []<br> links: [<0.1674.0>]<br> dictionary: []<br> trap_exit: false<br> status: running<br> heap_size: 2584<br> stack_size: 24<br> reductions: 2529<br> neighbours:<br><br>=CRASH REPORT==== 13-Sep-2012::12:17:00 ===<br>
crasher:<br> initial call: amqp_gen_connection:init/1<br> pid: <0.1671.0><br> registered_name: []<br> exception exit: {socket_error,etimedout}<br> in function gen_server:terminate/6 (gen_server.erl, line 747)<br>
ancestors: [<0.1670.0>,amqp_sup,<0.240.0>]<br> messages: []<br> links: [<0.1670.0>,#Port<0.5748>]<br> dictionary: []<br> trap_exit: false<br> status: running<br> heap_size: 1597<br>
stack_size: 24<br> reductions: 805<br> neighbours:<br><br>=SUPERVISOR REPORT==== 13-Sep-2012::12:17:00 ===<br> Supervisor: {<0.1670.0>,amqp_connection_sup}<br> Context: child_terminated<br> Reason: {socket_error,etimedout}<br>
Offender: [{pid,<0.1671.0>},<br> {name,connection},<br> {mfa,<br> {amqp_gen_connection,start_link,<br> [amqp_network_connection,<br>
{amqp_params_network,<<"guest">>,<<"guest">>,<br> <<"/">>,"<a href="http://slg1matt1.xyz.com/">slg1matt1.xyz.com</a>",<br>
5672,0,0,0,infinity,none,<br> [#Fun<amqp_auth_mechanisms.plain.3>,<br> #Fun<amqp_auth_mechanisms.amqplain.3>],<br> [],[]},<br>
#Fun<amqp_connection_sup.0.39273983>,<br> #Fun<amqp_connection_sup.2.54430129>,[]]}},<br> {restart_type,intrinsic},<br> {shutdown,brutal_kill},<br>
{child_type,worker}]<br><br><br>=SUPERVISOR REPORT==== 13-Sep-2012::12:17:00 ===<br> Supervisor: {<0.1670.0>,amqp_connection_sup}<br> Context: shutdown<br> Reason: reached_max_restart_intensity<br>
Offender: [{pid,<0.1671.0>},<br> {name,connection},<br> {mfa,<br> {amqp_gen_connection,start_link,<br> [amqp_network_connection,<br>
{amqp_params_network,<<"guest">>,<<"guest">>,<br> <<"/">>,"<a href="http://slg1matt1.xyz.com/">slg1matt1.xyz.com</a>",<br>
5672,0,0,0,infinity,none,<br> [#Fun<amqp_auth_mechanisms.plain.3>,<br> #Fun<amqp_auth_mechanisms.amqplain.3>],<br> [],[]},<br>
#Fun<amqp_connection_sup.0.39273983>,<br> #Fun<amqp_connection_sup.2.54430129>,[]]}},<br> {restart_type,intrinsic},<br> {shutdown,brutal_kill},<br>
{child_type,worker}]<br><br><br>=SUPERVISOR REPORT==== 13-Sep-2012::12:17:00 ===<br> Supervisor: {<0.1674.0>,<br> amqp_connection_type_sup}<br> Context: child_terminated<br>
Reason: {socket_error,etimedout}<br> Offender: [{pid,<0.1676.0>},<br> {name,main_reader},<br> {mfa,<br> {amqp_main_reader,start_link,<br> [#Port<0.5748>,<0.1671.0>,<0.1673.0>,<br>
{method,rabbit_framing_amqp_0_9_1}]}},<br> {restart_type,transient},<br> {shutdown,4294967295},<br> {child_type,worker}]<br><br><br>=SUPERVISOR REPORT==== 13-Sep-2012::12:17:00 ===<br>
Supervisor: {<0.1674.0>,<br> amqp_connection_type_sup}<br> Context: shutdown<br> Reason: reached_max_restart_intensity<br> Offender: [{pid,<0.1676.0>},<br>
{name,main_reader},<br> {mfa,<br> {amqp_main_reader,start_link,<br> [#Port<0.5748>,<0.1671.0>,<0.1673.0>,<br> {method,rabbit_framing_amqp_0_9_1}]}},<br>
{restart_type,transient},<br> {shutdown,4294967295},<br> {child_type,worker}]<br><br><br>=CRASH REPORT==== 13-Sep-2012::12:17:00 ===<br> crasher:<br> initial call: gen:init_it/6<br>
pid: <0.1657.0><br> registered_name: []<br> exception exit: {undef,<br> [{rabbit_federation_old_status,report,<br> [{upstream,<br> {amqp_params_network,<<"guest">>,<br>
<<"guest">>,<<"/">>,<br> "<a href="http://slg1matt1.xyz.com/">slg1matt1.xyz.com</a>",<br> 5672,0,0,0,infinity,none,<br>
[#Fun<amqp_auth_mechanisms.plain.3>,<br> #Fun<amqp_auth_mechanisms.amqplain.3>],<br> [],[]},<br>
<<"skytap">>,1000,1,1,none,none,none,<br> "upstream-slg"},<br> {resource,<<"/">>,exchange,<<"skytap">>},<br>
shutdown],<br> []},<br> {rabbit_federation_link,terminate,2,[]},<br> {gen_server2,terminate,3,[]},<br> {proc_lib,init_p_do_apply,3,<br>
[{file,"proc_lib.erl"},{line,227}]}]}<br> in function gen_server2:terminate/3 <br> ancestors: [<0.197.0>,<0.194.0>,rabbit_federation_link_sup_sup,<br> rabbit_federation_sup,rabbit_sup,<0.143.0>]<br>
messages: [{'DOWN',#Ref<0.0.0.8909>,process,<0.1683.0>,shutdown}]<br> links: []<br> dictionary: []<br> trap_exit: true<br> status: running<br> heap_size: 2584<br> stack_size: 24<br>
reductions: 9640<br> neighbours:<br><br>=SUPERVISOR REPORT==== 13-Sep-2012::12:17:00 ===<br> Supervisor: {<0.197.0>,<br> rabbit_federation_link_sup}<br> Context: shutdown_error<br>
Reason: {undef,<br> [{rabbit_federation_old_status,report,<br> [{upstream,<br> {amqp_params_network,<<"guest">>,<<"guest">>,<br>
<<"/">>,<br> "<a href="http://slg1matt1.xyz.com/">slg1matt1.xyz.com</a>",5672,0,<br> 0,0,infinity,none,<br>
[#Fun<amqp_auth_mechanisms.plain.3>,<br> #Fun<amqp_auth_mechanisms.amqplain.3>],<br> [],[]},<br> <<"skytap">>,1000,1,1,none,none,none,<br>
"upstream-slg"},<br> {resource,<<"/">>,exchange,<<"skytap">>},<br> shutdown],<br> []},<br>
{rabbit_federation_link,terminate,2,[]},<br> {gen_server2,terminate,3,[]},<br> {proc_lib,init_p_do_apply,3,<br> [{file,"proc_lib.erl"},{line,227}]}]}<br>
Offender: [{pid,<0.1657.0>},<br> {name,<br> {upstream,<br> {amqp_params_network,<<"guest">>,<<"guest">>,<br>
<<"/">>,"<a href="http://slg1matt1.xyz.com/">slg1matt1.xyz.com</a>",<br> 5672,0,0,0,infinity,none,<br> [#Fun<amqp_auth_mechanisms.plain.3>,<br>
#Fun<amqp_auth_mechanisms.amqplain.3>],<br> [],[]},<br> <<"skytap">>,1000,1,1,none,none,none,<br> "upstream-slg"}},<br>
{mfa,<br> {rabbit_federation_link,start_link,<br> [{{upstream,<br> {amqp_params_network,<<"guest">>,<<"guest">>,<br>
<<"/">>,<br> "<a href="http://slg1matt1.xyz.com/">slg1matt1.xyz.com</a>",5672,<br> 0,0,0,infinity,none,<br>
[#Fun<amqp_auth_mechanisms.plain.3>,<br> #Fun<amqp_auth_mechanisms.amqplain.3>],<br> [],[]},<br> <<"skytap">>,1000,1,1,none,none,none,<br>
"upstream-slg"},<br> {resource,<<"/">>,exchange,<<"skytap">>}}]}},<br> {restart_type,{permanent,1}},<br>
{shutdown,4294967295},<br> {child_type,worker}]<br><br>--------<br><br><div class="gmail_quote">On Thu, Sep 13, 2012 at 12:30 PM, Tim Watson <span dir="ltr"><<a href="mailto:tim@rabbitmq.com" target="_blank">tim@rabbitmq.com</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="im"><br>
On 13 Sep 2012, at 20:26, Matt Pietrek wrote:<br>
<br>
> We "own" the network in that it's a direct connection between two datacenters. It's not a particularly high speed connection, but it's generally reliable. I've done some ping tests between the two nodes and while an occasional packet will drop every once in awhile, it's not a really horrible connection. The average ping round trip time is areound 80ms.<br>
><br>
> I'd be surprised if it's a dropped packet, as the shutdown on both brokers seems to consistently take a long time. I just reproed this again, and one broker took 1000 seconds to shut down, while the other took 995. I have the logs from this repro, but a quick glance doesn't show anything radically different. Let me know if there's anything else I can supply.<br>
><br>
<br>
</div>Ok - do send the second pair of logs (or a scrubbed version if you need to hide private data) please, as we're keen to get to the bottom of this.<br>
<div class="HOEnZb"><div class="h5">_______________________________________________<br>
rabbitmq-discuss mailing list<br>
<a href="mailto:rabbitmq-discuss@lists.rabbitmq.com">rabbitmq-discuss@lists.rabbitmq.com</a><br>
<a href="https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss" target="_blank">https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss</a><br>
</div></div></blockquote></div><br>
_______________________________________________<br>rabbitmq-discuss mailing list<br><a href="mailto:rabbitmq-discuss@lists.rabbitmq.com">rabbitmq-discuss@lists.rabbitmq.com</a><br>https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss<br></blockquote></div><br></body></html>