<html>
<head>
<meta http-equiv="content-type" content="text/html; charset=ISO-8859-1">
</head>
<body text="#000000" bgcolor="#FFFFFF">
When a RabbitMQ cluster node comes back up after a server reboot,
we have seen more than a few cases where the RabbitMQ
server on the node does not start completely.<br>
<br>
This condition persisted even after the rabbit processes were killed
and RabbitMQ was restarted manually.<br>
<br>
The only way to get the server to start was to reset the node (or
explicitly delete the mnesia database).<br>
<br>
Are there any suggestions about how to handle this without losing
the state of the node?<br>
<br>
The system process list looked like this:<br>
<br>
<tt># ps aux | grep rabbit</tt><tt><br>
</tt><tt>rabbitmq 1005 0.0 0.0 9888 2788 ? S Jun13
1:01 /usr/lib/erlang/erts-5.10.2/bin/epmd -daemon</tt><tt><br>
</tt><tt>root 15746 0.0 0.0 11232 1708 pts/3 S+ 23:26
0:00 /bin/sh /etc/init.d/rabbitmq-server start</tt><tt><br>
</tt><tt>root 15797 0.0 0.0 11036 1468 pts/3 S+ 23:26
0:00 /bin/sh /usr/sbin/rabbitmqctl wait /var/run/rabbitmq/pid</tt><tt><br>
</tt><tt>rabbitmq 15799 0.0 0.0 11036 1424 ? S 23:26
0:00 /bin/sh /usr/sbin/rabbitmq-server</tt><tt><br>
</tt><tt>rabbitmq 15807 3.1 1.2 599876 47728 ? Sl 23:26
0:03 /usr/lib/erlang/erts-5.10.2/bin/beam -W w -K true -A30 -P
1048576 -- -root /usr/lib/erlang -progname erl -- -home
/var/lib/rabbitmq -- -pa
/usr/lib/rabbitmq/lib/rabbitmq_server-3.2.1/sbin/../ebin -noshell
-noinput -s rabbit boot -sname rabbit@my-rmq-server -boot
start_sasl -config /etc/rabbitmq/rabbitmq -kernel
inet_default_connect_options [{nodelay,true}] -sasl errlog_type
error -sasl sasl_error_logger false -rabbit error_logger
{file,"/var/log/rabbitmq/rabbit@my-rmq-server.log"} -rabbit
sasl_error_logger
{file,"/var/log/rabbitmq/rabbit@my-rmq-server-sasl.log"} -rabbit
enabled_plugins_file "/etc/rabbitmq/enabled_plugins" -rabbit
plugins_dir
"/usr/lib/rabbitmq/lib/rabbitmq_server-3.2.1/sbin/../plugins"
-rabbit plugins_expand_dir
"/var/lib/rabbitmq/mnesia/rabbit@my-rmq-server-plugins-expand"
-os_mon start_cpu_sup false -os_mon start_disksup false -os_mon
start_memsup false -mnesia dir
"/var/lib/rabbitmq/mnesia/rabbit@my-rmq-server"</tt><tt><br>
</tt><tt>rabbitmq 15814 0.0 0.0 94432 2636 pts/3 S+ 23:26
0:00 su rabbitmq -s /bin/sh -c /usr/lib/rabbitmq/bin/rabbitmqctl
"wait" "/var/run/rabbitmq/pid"</tt><tt><br>
</tt><tt>rabbitmq 15819 0.2 0.3 106624 14008 pts/3 Sl+ 23:26
0:00 /usr/lib/erlang/erts-5.10.2/bin/beam -- -root /usr/lib/erlang
-progname erl -- -home /var/lib/rabbitmq -- -pa
/usr/lib/rabbitmq/lib/rabbitmq_server-3.2.1/sbin/../ebin -noshell
-noinput -hidden -sname rabbitmqctl15819 -boot start_clean -s
rabbit_control_main -nodename rabbit@my-rmq-server -extra wait
/var/run/rabbitmq/pid</tt><tt><br>
</tt><tt><br>
</tt>This RabbitMQ node showed as an "up" node in the Nodes list in
the management console of another node in the cluster.<br>
<br>
Also, rabbitmqctl returned some results: <br>
<tt><br>
<br>
# rabbitmqctl status</tt><tt><br>
</tt><tt>Status of node 'rabbit@my-rmq-server' ...</tt><tt><br>
</tt><tt>[{pid,1114},</tt><tt><br>
</tt><tt> {running_applications,</tt><tt><br>
</tt><tt> [{os_mon,"CPO CXC 138 46","2.2.12"},</tt><tt><br>
</tt><tt> {inets,"INETS CXC 138 49","5.9.5"},</tt><tt><br>
</tt><tt> {mnesia,"MNESIA CXC 138 12","4.9"},</tt><tt><br>
</tt><tt> {amqp_client,"RabbitMQ AMQP Client","3.2.1"},</tt><tt><br>
</tt><tt> {rabbitmq_auth_mechanism_ssl,</tt><tt><br>
</tt><tt> "RabbitMQ SSL authentication (SASL
EXTERNAL)","3.2.1"},</tt><tt><br>
</tt><tt> {xmerl,"XML parser","1.3.3"},</tt><tt><br>
</tt><tt> {eldap,"Ldap api","1.0.1"},</tt><tt><br>
</tt><tt> {rfc4627_jsonrpc,"JSON RPC
Service","3.2.1-git5e67120"},</tt><tt><br>
</tt><tt> {sasl,"SASL CXC 138 11","2.3.2"},</tt><tt><br>
</tt><tt> {stdlib,"ERTS CXC 138 10","1.19.2"},</tt><tt><br>
</tt><tt> {kernel,"ERTS CXC 138 10","2.16.2"}]},</tt><tt><br>
</tt><tt> {os,{unix,linux}},</tt><tt><br>
</tt><tt> {erlang_version,</tt><tt><br>
</tt><tt> "Erlang R16B01 (erts-5.10.2) [source-bdf5300] [64-bit]
[smp:2:2] [async-threads:30] [hipe] [kernel-poll:true]\n"},</tt><tt><br>
</tt><tt> {memory,</tt><tt><br>
</tt><tt> [{total,44596672},</tt><tt><br>
</tt><tt> {connection_procs,2808},</tt><tt><br>
</tt><tt> {queue_procs,0},</tt><tt><br>
</tt><tt> {plugins,8464},</tt><tt><br>
</tt><tt> {other_proc,15751480},</tt><tt><br>
</tt><tt> {mnesia,1191152},</tt><tt><br>
</tt><tt> {mgmt_db,0},</tt><tt><br>
</tt><tt> {msg_index,0},</tt><tt><br>
</tt><tt> {other_ets,1235896},</tt><tt><br>
</tt><tt> {binary,716136},</tt><tt><br>
</tt><tt> {code,20445199},</tt><tt><br>
</tt><tt> {atom,711569},</tt><tt><br>
</tt><tt> {other_system,4533968}]},</tt><tt><br>
</tt><tt> {file_descriptors,</tt><tt><br>
</tt><tt>
[{total_limit,924},{total_used,0},{sockets_limit,829},{sockets_used,0}]},</tt><tt><br>
</tt><tt> {processes,[{limit,1048576},{used,105}]},</tt><tt><br>
</tt><tt> {run_queue,0},</tt><tt><br>
</tt><tt> {uptime,271}]</tt><tt><br>
</tt><tt>...done.</tt><tt><br>
</tt><tt><br>
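</tt>One detail worth noting in the status output above:
<tt>running_applications</tt> lists mnesia and several plugins, but not
<tt>rabbit</tt> itself, which is consistent with the broker application
never finishing its boot. A minimal Python sketch for checking this
automatically might look like the following. It assumes the Erlang-term
output format shown above (as printed by 3.2.1); the function names are
made up for illustration and are not part of any RabbitMQ tooling.<br>
<br>
<pre>
```python
import re


def running_applications(status_text: str) -> list:
    """Return the application names listed under running_applications."""
    # Isolate the running_applications list; it ends at the first "]}".
    match = re.search(
        r"\{running_applications,\s*\[(.*?)\]\}", status_text, re.DOTALL
    )
    if match is None:
        return []
    # Each entry looks like {name,"Description","Version"}.
    return re.findall(r"\{(\w+),", match.group(1))


def broker_is_running(status_text: str) -> bool:
    """True only if the rabbit application itself is among the running apps."""
    return "rabbit" in running_applications(status_text)
```
</pre>
Given the captured text of <tt>rabbitmqctl status</tt> from this node,
<tt>broker_is_running</tt> would return <tt>False</tt>, even though the
Erlang VM answers the status request.<br>
<tt><br>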
</tt>The startup log and the RabbitMQ log indicated that the node had
begun to start up:<br>
<tt><br>
</tt><tt># cat startup_log </tt><tt><br>
</tt><tt><br>
</tt><tt> RabbitMQ 3.2.1. Copyright (C) 2007-2013
GoPivotal, Inc.</tt><tt><br>
</tt><tt> ## ## Licensed under the MPL. See
<a class="moz-txt-link-freetext" href="http://www.rabbitmq.com/">http://www.rabbitmq.com/</a></tt><tt><br>
</tt><tt> ## ##</tt><tt><br>
</tt><tt> ########## Logs:
/var/log/rabbitmq/rabbit@my-rmq-server.log</tt><tt><br>
</tt><tt> ###### ##
/var/log/rabbitmq/rabbit@my-rmq-server-sasl.log</tt><tt><br>
</tt><tt> ##########</tt><tt><br>
</tt><tt> Starting broker...</tt><tt><br>
</tt><tt><br>
<br>
</tt><tt># cat rabbit@my-rmq-server.log</tt><tt><br>
</tt><tt><br>
</tt><tt>=INFO REPORT==== 25-Jul-2014::17:18:21 ===</tt><tt><br>
</tt><tt>Starting RabbitMQ 3.2.1 on Erlang R16B01</tt><tt><br>
</tt><tt>Copyright (C) 2007-2013 GoPivotal, Inc.</tt><tt><br>
</tt><tt>Licensed under the MPL. See <a class="moz-txt-link-freetext" href="http://www.rabbitmq.com/">http://www.rabbitmq.com/</a></tt><tt><br>
</tt><tt><br>
</tt><tt>=INFO REPORT==== 25-Jul-2014::17:18:21 ===</tt><tt><br>
</tt><tt>node : rabbit@my-rmq-server</tt><tt><br>
</tt><tt>home dir : /var/lib/rabbitmq</tt><tt><br>
</tt><tt>config file(s) : /etc/rabbitmq/rabbitmq.config</tt><tt><br>
</tt><tt>cookie hash : WmWI9mzuXn9u47LQDipY3g==</tt><tt><br>
</tt><tt>log : /var/log/rabbitmq/rabbit@my-rmq-server.log</tt><tt><br>
</tt><tt>sasl log :
/var/log/rabbitmq/rabbit@my-rmq-server-sasl.log</tt><tt><br>
</tt><tt>database dir :
/var/lib/rabbitmq/mnesia/rabbit@my-rmq-server</tt><tt><br>
</tt><tt><br>
</tt><tt>=INFO REPORT==== 25-Jul-2014::17:18:23 ===</tt><tt><br>
</tt><tt>Limiting to approx 924 file handles (829 sockets)</tt><tt><br>
</tt><tt>root@my-rmq-server:/var/log/rabbitmq# </tt><tt><br>
</tt><tt><br>
</tt><br>
Some time passed without any activity in either the logs or the
files in the mnesia database:<br>
<br>
<tt># date</tt><tt><br>
</tt><tt>Fri Jul 25 17:23:56 UTC 2014</tt><tt><br>
</tt><tt><br>
</tt><tt><br>
</tt><tt># ls -lt /var/lib/rabbitmq/mnesia/rabbit@my-rmq-server</tt><tt><br>
</tt><tt>total 148</tt><tt><br>
</tt><tt>-rw-r--r-- 1 rabbitmq rabbitmq 271 Jul 25 17:21
DECISION_TAB.LOG</tt><tt><br>
</tt><tt>-rw-r--r-- 1 rabbitmq rabbitmq 102 Jul 25 17:21
LATEST.LOG</tt><tt><br>
</tt><tt>-rw-r--r-- 1 rabbitmq rabbitmq 171 Jul 25 17:18
nodes_running_at_shutdown</tt><tt><br>
</tt><tt>-rw-r--r-- 1 rabbitmq rabbitmq 317 Jul 25 17:18
cluster_nodes.config</tt><tt><br>
</tt><tt>-rw-r--r-- 1 rabbitmq rabbitmq 137 Jul 25 17:18
rabbit_vhost.DCD</tt><tt><br>
</tt><tt>-rw-r--r-- 1 rabbitmq rabbitmq 640 Jul 25 17:18
rabbit_user.DCD</tt><tt><br>
</tt><tt>-rw-r--r-- 1 rabbitmq rabbitmq 10207 Jul 25 17:18
rabbit_runtime_parameters.DCD</tt><tt><br>
</tt><tt>-rw-r--r-- 1 rabbitmq rabbitmq 20423 Jul 25 17:18
rabbit_durable_route.DCD</tt><tt><br>
</tt><tt>-rw-r--r-- 1 rabbitmq rabbitmq 21020 Jul 25 17:18
rabbit_durable_queue.DCD</tt><tt><br>
</tt><tt>-rw-r--r-- 1 rabbitmq rabbitmq 2724 Jul 25 17:18
rabbit_durable_exchange.DCD</tt><tt><br>
</tt><tt>-rw-r--r-- 1 rabbitmq rabbitmq 850 Jul 25 17:18
rabbit_user_permission.DCD</tt><tt><br>
</tt><tt>drwxr-xr-x 2 rabbitmq rabbitmq 4096 Jul 25 17:16
msg_store_transient</tt><tt><br>
</tt><tt>drwxr-xr-x 2 rabbitmq rabbitmq 4096 Jul 25 17:16
msg_store_persistent</tt><tt><br>
</tt><tt>drwxr-xr-x 170 rabbitmq rabbitmq 12288 Jul 25 17:16 queues</tt><tt><br>
</tt><tt>-rw-r--r-- 1 rabbitmq rabbitmq 28983 Jul 24 23:35
schema.DAT</tt><tt><br>
</tt><tt>-rw-r--r-- 1 rabbitmq rabbitmq 3 Jun 13 09:41
rabbit_serial</tt><tt><br>
</tt><tt>-rw-r--r-- 1 rabbitmq rabbitmq 238 Jun 13 09:41
schema_version</tt><tt><br>
</tt>
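<br>
The "no activity" observation above can be checked with something like
the following Python sketch, which reports how long ago anything directly
inside a directory was last modified. The function name is hypothetical,
and it only looks at top-level entries (not the contents of
<tt>queues</tt> or the message store directories), so treat it as a rough
indicator only.<br>
<br>
<pre>
```python
import os
import time


def seconds_since_last_change(directory, now=None):
    """Seconds since the newest top-level entry in directory was modified."""
    if now is None:
        now = time.time()
    # Modification times of the entries directly inside the directory.
    mtimes = [entry.stat().st_mtime for entry in os.scandir(directory)]
    if not mtimes:
        raise ValueError("directory is empty: " + directory)
    return now - max(mtimes)
```
</pre>
For example, it could be pointed at
<tt>/var/lib/rabbitmq/mnesia/rabbit@my-rmq-server</tt> and at
<tt>/var/log/rabbitmq</tt> while the node appears stuck.<br>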
</body>
</html>