<div dir="ltr">FYI: I have not been able to reproduce this problem since it happened. Perhaps my servers were somehow in a bad state to begin with. So I guess you can consider this low priority, for informational purposes only. ;-)<div>
<br></div><div>-Chris</div></div><div class="gmail_extra"><br><br><div class="gmail_quote">On Thu, Aug 15, 2013 at 4:09 PM, Chris <span dir="ltr"><<a href="mailto:stuff@moesel.net" target="_blank">stuff@moesel.net</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div dir="ltr">Hello,<div><br></div><div>I had a running cluster of two RabbitMQ 3.1.1 servers on Red Hat 6.2. I left both running and then attempted to upgrade one (via yum). After the upgrade, rabbitmqctl reported that the cluster_status was good, but none of my consumers seemed to be working.</div>
<div><br></div><div>I then attempted to upgrade the other, hoping that would fix things, but the upgrade just hung. After killing the upgrade (Ctrl-C), I found that I could no longer stop rabbitmq-server (neither via the service script nor via rabbitmqctl). I had to kill it manually. After killing it, I re-ran the upgrade and all was well.</div>
<div><br></div><div>Looking in the logs, I then saw a BUNCH of errors with timestamps corresponding to when I upgraded the first server. It seems that upgrade wasn't handled cleanly by the remaining 3.1.1 node, which might be responsible for all the trouble. Did I just get unlucky?</div>
<div><br></div><div>Here's the SASL log:</div><div><br></div><div><div>=CRASH REPORT==== 15-Aug-2013::14:27:49 ===</div><div> crasher:</div><div> initial call: gen:init_it/6</div><div> pid: <0.271.0></div>
<div> registered_name: []</div><div> exception exit: {{badmatch,{error,not_found}},</div><div> [{rabbit_mirror_queue_master,stop_all_slaves,2,</div><div> [{file,"src/rabbit_mirror_queue_master.erl"},</div>
<div> {line,179}]},</div><div> {rabbit_mirror_queue_master,delete_and_terminate,2,</div><div> [{file,"src/rabbit_mirror_queue_master.erl"},</div>
<div> {line,175}]},</div><div> {rabbit_amqqueue_process,'-terminate/2-fun-3-',5,</div><div> [{file,"src/rabbit_amqqueue_process.erl"},</div>
<div> {line,162}]},</div><div> {rabbit_amqqueue_process,terminate_shutdown,2,</div><div> [{file,"src/rabbit_amqqueue_process.erl"},</div><div>
{line,272}]},</div><div> {gen_server2,terminate,3,</div><div> [{file,"src/gen_server2.erl"},{line,1031}]},</div><div> {proc_lib,wake_up,3,</div>
<div> [{file,"proc_lib.erl"},{line,249}]}]}</div><div> in function gen_server2:terminate/3 (src/gen_server2.erl, line 1034)</div><div> ancestors: [rabbit_mirror_queue_slave_sup,rabbit_sup,<0.148.0>]</div>
<div> messages: []</div><div> links: [<0.270.0>]</div><div> dictionary: [{guid,{{3434499189,622214121,884364685,3594937084},1}}]</div><div>
trap_exit: true</div><div> status: running</div><div> heap_size: 1598</div>
<div> stack_size: 27</div><div> reductions: 9106</div><div> neighbours:</div><div><br></div><div>=SUPERVISOR REPORT==== 15-Aug-2013::14:27:49 ===</div><div> Supervisor: {local,</div><div> rabbit_mirror_queue_slave_sup}</div>
<div> Context: child_terminated</div><div> Reason: {{badmatch,{error,not_found}},</div><div> [{rabbit_mirror_queue_master,stop_all_slaves,2,</div><div> [{file,"src/rabbit_mirror_queue_master.erl"},</div>
<div> {line,179}]},</div><div> {rabbit_mirror_queue_master,delete_and_terminate,2,</div><div> [{file,"src/rabbit_mirror_queue_master.erl"},</div><div>
{line,175}]},</div><div> {rabbit_amqqueue_process,'-terminate/2-fun-3-',5,</div><div> [{file,"src/rabbit_amqqueue_process.erl"},{line,162}]},</div>
<div> {rabbit_amqqueue_process,terminate_shutdown,2,</div><div> [{file,"src/rabbit_amqqueue_process.erl"},{line,272}]},</div><div> {gen_server2,terminate,3,</div>
<div> [{file,"src/gen_server2.erl"},{line,1031}]},</div><div> {proc_lib,wake_up,3,[{file,"proc_lib.erl"},{line,249}]}]}</div><div> Offender: [{pid,<0.271.0>},</div>
<div> {name,rabbit_mirror_queue_slave},</div><div> {mfa,</div><div> {rabbit_mirror_queue_slave,start_link,</div><div> [{amqqueue,</div><div>
{resource,<<"acs">>,queue,</div>
<div> <<"replies.4a0e284c-1662-463a-b363-cbb4e9557266">>},</div><div> true,false,none,</div><div> [{<<"x-expires">>,signedint,600000}],</div>
<div> <7111.3423.0>,[],[],</div><div> [{vhost,<<"acs">>},</div><div> {name,<<"ha-acs">>},</div>
<div> {pattern,<<".*">>},</div><div> {definition,</div><div> [{<<"ha-mode">>,<<"exactly">>},</div>
<div> {<<"ha-params">>,2}]},</div><div> {priority,0}],</div><div> [{<7111.3424.0>,<7111.3423.0>},</div>
<div> {<7111.8011.82>,<7111.8010.82>},</div><div> {<0.27964.278>,<0.27962.278>}]}]}},</div><div> {restart_type,temporary},</div>
<div> {shutdown,4294967295},</div><div> {child_type,worker}]</div></div><div><br></div><div>Thanks!</div><span class="HOEnZb"><font color="#888888"><div>Chris</div></font></span></div>
</blockquote></div><br></div>