[rabbitmq-discuss] Hang while waiting for shutdown of last node in an HA cluster
Matt Pietrek
mpietrek at skytap.com
Mon Jul 16 21:23:31 BST 2012
We have a 3 node cluster running 2.8.4 supporting a small number of HA
queues. During a controlled shutdown, we bring down each broker (play,
play2, util) in sequence using something along the lines of
rabbitmqctl stop <pidfile>
The first two brokers shut down without incident. The last shutdown gets
stuck and never finishes.
I see the beam.smp processes running for both the broker and rabbitmqctl
instance.
The log file has the following:
=INFO REPORT==== 16-Jul-2012::13:06:19 ===
rabbit on node rabbit at play2 down
=INFO REPORT==== 16-Jul-2012::13:06:19 ===
Mirrored-queue (queue 'unit_test' in vhost '/'): Master <rabbit at play.2.271.0>
saw deaths of mirrors <rabbit at play2.3.296.0>
=ERROR REPORT==== 16-Jul-2012::13:06:19 ===
** Generic server rabbit_node_monitor terminating
** Last message in was {'DOWN',#Ref<0.0.0.1578>,process,
{rabbit,rabbit at play2},
normal}
** When Server state == [rabbit at play2]
** Reason for termination ==
** {bad_return_value,
{error,
{badarg,
[{erlang,is_process_alive,[<3173.371.0>]},
{rabbit_amqqueue,'-on_node_down/1-fun-1-',8},
{qlc,collect,1},
{qlc,eval,2},
{rabbit_amqqueue,'-on_node_down/1-fun-16-',1},
{mnesia_tm,apply_fun,3},
{mnesia_tm,execute_transaction,5},
{rabbit_misc,'-execute_mnesia_transaction/1-fun-0-',1}]}}}
=INFO REPORT==== 16-Jul-2012::13:06:19 ===
Mirrored-queue (queue 'fake_munger_queue' in vhost '/'): Master
<rabbit at play.2.270.0> saw deaths of mirrors <rabbit at play2.3.287.0>
=INFO REPORT==== 16-Jul-2012::13:06:19 ===
Mirrored-queue (queue 'random' in vhost '/'): Master <rabbit at play.2.269.0>
saw deaths of mirrors <rabbit at play2.3.278.0>
=INFO REPORT==== 16-Jul-2012::13:06:19 ===
Mirrored-queue (queue 'charon' in vhost '/'): Master <rabbit at play.2.276.0>
saw deaths of mirrors <rabbit at play2.3.291.0>
=INFO REPORT==== 16-Jul-2012::13:06:19 ===
Mirrored-queue (queue 'ConfigurationManager' in vhost '/'): Master
<rabbit at play.2.275.0> saw deaths of mirrors <rabbit at play2.3.289.0>
=INFO REPORT==== 16-Jul-2012::13:06:19 ===
Mirrored-queue (queue 'fake_service_2' in vhost '/'): Master
<rabbit at play.2.272.0> saw deaths of mirrors <rabbit at play2.3.275.0>
=INFO REPORT==== 16-Jul-2012::13:06:19 ===
Mirrored-queue (queue 'system_test' in vhost '/'): Master
<rabbit at play.2.277.0> saw deaths of mirrors <rabbit at play2.3.293.0>
=INFO REPORT==== 16-Jul-2012::13:06:19 ===
Mirrored-queue (queue 'fake_configmgr' in vhost '/'): Master
<rabbit at play.2.273.0> saw deaths of mirrors <rabbit at play2.3.281.0>
=INFO REPORT==== 16-Jul-2012::13:06:32 ===
application: rabbitmq_tracing
exited: stopped
type: permanent
=INFO REPORT==== 16-Jul-2012::13:06:32 ===
Stopping Rabbit
--------
Since I've turned on Mnesia logging, the rabbitmq-server console spew is:
Mnesia(rabbit at play): Logging mnesia_down rabbit at play2
Mnesia(rabbit at play): Got mnesia_down from rabbit at play2, reconfiguring...
Mnesia(rabbit at play): Transaction {tid,1786,<0.176.0>} calling
#Fun<rabbit_mirror_queue_misc.0.102623438> with [] failed:
{bad_commit,rabbit at play2}
Mnesia(rabbit at play): Restarting transaction {tid,1786,<0.176.0>}: in 2ms
{bad_commit,rabbit at play2}
Mnesia(rabbit at play): Transaction {tid,1785,<0.175.0>} calling
#Fun<rabbit_mirror_queue_misc.0.102623438> with [] failed:
{bad_commit,rabbit at play2}
Mnesia(rabbit at play): Restarting transaction {tid,1785,<0.175.0>}: in 2ms
{bad_commit,rabbit at play2}
Mnesia(rabbit at play): write performed by {tid,1787,<0.176.0>} on record:
{rabbit_queue,{resource,<<"/">>,queue,<<"fake_service_2">>},
true,false,none,
[{<<"x-ha-policy">>,longstr,<<"all">>}],
<0.272.0>,[],all}
Mnesia(rabbit at play): write performed by {tid,1787,<0.176.0>} on record:
{rabbit_durable_queue,{resource,<<"/">>,queue,<<"fake_service_2">>},
true,false,none,
[{<<"x-ha-policy">>,longstr,<<"all">>}],
<0.272.0>,[],all}
Mnesia(rabbit at play): write performed by {tid,1788,<0.176.0>} on record:
{rabbit_queue,{resource,<<"/">>,queue,<<"fake_queue">>},
true,false,none,
[{<<"x-ha-policy">>,longstr,<<"all">>}],
<0.270.0>,[],all}
Mnesia(rabbit at play): write performed by {tid,1788,<0.176.0>} on record:
{rabbit_durable_queue,{resource,<<"/">>,queue,<<"fake_queue">>},
true,false,none,
[{<<"x-ha-policy">>,longstr,<<"all">>}],
<0.270.0>,[],all}
Mnesia(rabbit at play): write performed by {tid,1789,<0.175.0>} on record:
{rabbit_queue,{resource,<<"/">>,queue,<<"CM">>},
true,false,none,
[{<<"x-ha-policy">>,longstr,<<"all">>}],
<0.275.0>,[],all}
Mnesia(rabbit at play): write performed by {tid,1789,<0.175.0>} on record:
{rabbit_durable_queue,
{resource,<<"/">>,queue,<<"CM">>},
true,false,none,
[{<<"x-ha-policy">>,longstr,<<"all">>}],
<0.275.0>,[],all}
Mnesia(rabbit at play): write performed by {tid,1790,<0.175.0>} on record:
{rabbit_queue,{resource,<<"/">>,queue,<<"cha">>},
true,false,none,
[{<<"x-ha-policy">>,longstr,<<"all">>}],
<0.276.0>,[],all}
Mnesia(rabbit at play): write performed by {tid,1790,<0.175.0>} on record:
{rabbit_durable_queue,{resource,<<"/">>,queue,<<"cha">>},
true,false,none,
[{<<"x-ha-policy">>,longstr,<<"all">>}],
<0.276.0>,[],all}
Mnesia(rabbit at play): write performed by {tid,1791,<0.175.0>} on record:
{rabbit_queue,{resource,<<"/">>,queue,<<"random">>},
true,false,none,
[{<<"x-ha-policy">>,longstr,<<"all">>}],
<0.269.0>,[],all}
Mnesia(rabbit at play): write performed by {tid,1791,<0.175.0>} on record:
{rabbit_durable_queue,{resource,<<"/">>,queue,<<"random">>},
true,false,none,
[{<<"x-ha-policy">>,longstr,<<"all">>}],
<0.269.0>,[],all}
Mnesia(rabbit at play): write performed by {tid,1793,<0.175.0>} on record:
{rabbit_queue,{resource,<<"/">>,queue,<<"system_test">>},
true,false,none,
[{<<"x-ha-policy">>,longstr,<<"all">>}],
<0.277.0>,[],all}
Mnesia(rabbit at play): write performed by {tid,1793,<0.175.0>} on record:
{rabbit_durable_queue,{resource,<<"/">>,queue,<<"system_test">>},
true,false,none,
[{<<"x-ha-policy">>,longstr,<<"all">>}],
<0.277.0>,[],all}
Mnesia(rabbit at play): Transaction {tid,1794,<0.175.0>} calling
#Fun<rabbit_amqqueue.22.63484291> with [] failed:
{badarg,[{erlang,is_process_alive,[<3173.371.0>]},
{rabbit_amqqueue,'-on_node_down/1-fun-1-',8},
{qlc,collect,1},
{qlc,eval,2},
{rabbit_amqqueue,'-on_node_down/1-fun-16-',1},
{mnesia_tm,apply_fun,3},
{mnesia_tm,execute_transaction,5},
{rabbit_misc,'-execute_mnesia_transaction/1-fun-0-',1}]}
Mnesia(rabbit at play): write performed by {tid,1792,<0.176.0>} on record:
{rabbit_queue,{resource,<<"/">>,queue,<<"fake_CM">>},
true,false,none,
[{<<"x-ha-policy">>,longstr,<<"all">>}],
<0.273.0>,[],all}
Mnesia(rabbit at play): write performed by {tid,1792,<0.176.0>} on record:
{rabbit_durable_queue,{resource,<<"/">>,queue,<<"fake_CM">>},
true,false,none,
[{<<"x-ha-policy">>,longstr,<<"all">>}],
<0.273.0>,[],all}
Mnesia(rabbit at play): Transaction log dump initiated by time_threshold:
{needs_dump,45}
Mnesia(rabbit at play): Transaction log dump initiated by time_threshold:
already_dumped
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/attachments/20120716/7aed47de/attachment.htm>
More information about the rabbitmq-discuss
mailing list