[rabbitmq-discuss] Rabbit startup command is hanging
Jason McIntosh
mcintoshj at gmail.com
Thu Apr 10 22:22:40 BST 2014
So now the fun part. I decided to try and rebuild the middle node (I have
boxes 10, 11 and 12). However, I can't get the middle node to reconnect to
the cluster. Removing its mnesia directory allowed it to start, but it
can't rejoin the cluster. So I tried removing the node from the cluster,
e.g.:
rabbitmqctl -n cluster@rabbitmqm10 forget_cluster_node cluster@rabbitmqm11
But the above never responds; it just sits there hanging.
Running rabbitmqctl -n cluster@rabbitmqm11 status from the other nodes works
fine. I'm about at a loss as to how the heck to repair things. I can't
remove the node from the cluster, I can't start it with the mnesia
directory in its current state, and removing the mnesia directory and
trying to add it back in fails with "....done
(already_member).". Running rabbitmqctl update_cluster_nodes
cluster@rabbitmqm10 also just sits there, doing nothing and not
responding.
I'm starting to really worry I'm going to have to completely rebuild my
cluster...
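For reference, the order in which the RabbitMQ clustering guide issues the steps for dropping and re-adding a node looks roughly like the sketch below. It is a dry run by default (it only prints the commands). The `run` and `rejoin_plan` helpers are made up for illustration, and the node names are the ones from this thread written in the usual name@host form. It does not explain why forget_cluster_node hangs here; it only lays out the normal sequence.

```shell
#!/bin/sh
# Sketch only: prints the commands unless DRY_RUN=0 is set in the environment.
DRY_RUN="${DRY_RUN:-1}"

# Hypothetical helper: echo the command in dry-run mode, otherwise execute it.
run() {
    if [ "$DRY_RUN" = "1" ]; then
        printf '+ %s\n' "$*"
    else
        "$@"
    fi
}

# Hypothetical helper: rejoin_plan STALE_NODE RUNNING_NODE
rejoin_plan() {
    stale="$1"
    running="$2"
    # 1. The node being removed has to be offline first.
    run rabbitmqctl -n "$stale" stop_app
    # 2. From a healthy, running node, drop the stale membership record.
    run rabbitmqctl -n "$running" forget_cluster_node "$stale"
    # 3. Wipe the stale node's local cluster state, join again, and restart.
    run rabbitmqctl -n "$stale" reset
    run rabbitmqctl -n "$stale" join_cluster "$running"
    run rabbitmqctl -n "$stale" start_app
}

rejoin_plan cluster@rabbitmqm11 cluster@rabbitmqm10
```

Running it as-is prints the five commands prefixed with "+", which makes it easy to compare against what was actually typed before trying anything on a live cluster.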
Jason
On Thu, Apr 10, 2014 at 2:55 PM, Jason McIntosh <mcintoshj at gmail.com> wrote:
> Not sure what's going on here. Just upgraded my cluster from 3.2.3 to
> 3.2.4 (including a restart of the machine). On startup, two of my initial
> nodes started fine, but when the third node in the cluster started, the
> "/etc/init.d/rabbitmq-server start" just sits at "Starting rabbitmq-server:
> " without ever finishing. Doing a rabbitmqctl status shows:
> Status of node cluster@rabbitmqm11p ...
> [{pid,62505},
> {running_applications,[{os_mon,"CPO CXC 138 46","2.2.14"},
> {inets,"INETS CXC 138 49","5.9.8"},
> {mnesia,"MNESIA CXC 138 12","4.11"},
> {amqp_client,"RabbitMQ AMQP Client","3.2.4"},
> {xmerl,"XML parser","1.3.6"},
> {eldap,"Ldap api","1.0.2"},
> {sasl,"SASL CXC 138 11","2.3.4"},
> {stdlib,"ERTS CXC 138 10","1.19.4"},
> {kernel,"ERTS CXC 138 10","2.16.4"}]},
> {os,{unix,linux}},
> {erlang_version,"Erlang R16B03-1 (erts-5.10.4) [source] [64-bit]
> [smp:24:24] [async-threads:30] [hipe] [kernel-poll:true]\n"},
> {memory,[{total,48504352},
> {connection_procs,2808},
> {queue_procs,0},
> {plugins,0},
> {other_proc,16290632},
> {mnesia,1783536},
> {mgmt_db,0},
> {msg_index,0},
> {other_ets,1120896},
> {binary,725448},
> {code,19691642},
> {atom,703377},
> {other_system,8186013}]},
> {file_descriptors,[{total_limit,12188},
> {total_used,0},
> {sockets_limit,10967},
> {sockets_used,0}]},
> {processes,[{limit,1048576},{used,117}]},
> {run_queue,0},
> {uptime,83}]
> ...done.
>
>
> In the web management interface, I see this:
> Node statistics not available
> Memory details
>
> Connections 2.7kB
> Queues 0B
> Plugins 0B
> Other process memory 16MB
> Mnesia 1.7MB
> Message store index 0B
> Management database 0B
> Other ETS tables 1.1MB
> Binaries 708kB
> Code 19MB
> Atoms 687kB
> Other system 7.8MB
>
>
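The web UI figures above look like the same byte counts that rabbitmqctl status reports, just rounded. A small sketch to check that, assuming 1024-based units and a rounding rule (one decimal below 10, none otherwise) guessed from these numbers alone. `fmt_bytes` is a hypothetical helper, not part of RabbitMQ:

```shell
#!/bin/sh
# Hypothetical helper: format a byte count the way the figures above appear.
# The rounding rule is inferred from this thread's numbers, not from RabbitMQ.
fmt_bytes() {
    awk -v b="$1" 'BEGIN {
        split("B kB MB GB", unit, " ")
        i = 1
        # Scale down by 1024 until the value fits the unit.
        while (b >= 1024 && i < 4) { b /= 1024; i++ }
        if (i == 1)      printf "%dB\n", b
        else if (b < 10) printf "%.1f%s\n", b, unit[i]
        else             printf "%.0f%s\n", b, unit[i]
    }'
}

# Byte counts copied from the {memory,...} section of the status output.
for b in 2808 16290632 1783536 1120896 725448 19691642 703377 8186013; do
    printf '%s -> %s\n' "$b" "$(fmt_bytes "$b")"
done
```

If the printed values line up with the UI, both views are reading the same memory breakdown from the node, so the "Node statistics not available" message is a separate problem.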
> So rabbit appears to have sort of started, but certain things are not
> started (e.g. plugins). The plugins list is:
> [e] amqp_client 3.2.4
> [ ] cowboy 0.5.0-rmq3.2.4-git4b93c2d
> [ ] eldap 3.2.4-gite309de4
> [e] mochiweb 2.7.0-rmq3.2.4-git680dba8
> [ ] rabbitmq_amqp1_0 3.2.4
> [E] rabbitmq_auth_backend_ldap 3.2.4
> [ ] rabbitmq_auth_mechanism_ssl 3.2.4
> [E] rabbitmq_consistent_hash_exchange 3.2.4
> [E] rabbitmq_federation 3.2.4
> [E] rabbitmq_federation_management 3.2.4
> [ ] rabbitmq_jsonrpc 3.2.4
> [ ] rabbitmq_jsonrpc_channel 3.2.4
> [ ] rabbitmq_jsonrpc_channel_examples 3.2.4
> [E] rabbitmq_management 3.2.4
> [E] rabbitmq_management_agent 3.2.4
> [E] rabbitmq_management_visualiser 3.2.4
> [ ] rabbitmq_mqtt 3.2.4
> [E] rabbitmq_shovel 3.2.4
> [E] rabbitmq_shovel_management 3.2.4
> [ ] rabbitmq_stomp 3.2.4
> [ ] rabbitmq_tracing 3.2.4
> [e] rabbitmq_web_dispatch 3.2.4
> [ ] rabbitmq_web_stomp 3.2.4
> [ ] rabbitmq_web_stomp_examples 3.2.4
> [ ] rfc4627_jsonrpc 3.2.4-git5e67120
> [ ] sockjs 0.3.4-rmq3.2.4-git3132eb9
> [e] webmachine 1.10.3-rmq3.2.4-gite9359c7
>
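In this listing [E] marks a plugin that was enabled explicitly and [e] one that was pulled in as a dependency. A rough filter to pull out just the enabled ones, so they can be compared against running_applications in the status output (where, of the enabled plugins, only amqp_client shows up). `enabled_plugins` is a hypothetical helper, and the sample input is a few lines copied from the listing:

```shell
#!/bin/sh
# Hypothetical helper: read `rabbitmq-plugins list` style lines on stdin
# (with or without a "> " quote prefix) and print only the enabled plugins.
enabled_plugins() {
    sed 's/^> *//' | awk '$1 == "[E]" || $1 == "[e]" { print $1, $2 }'
}

# Sample lines taken from the listing in this thread.
enabled_plugins <<'EOF'
> [e] amqp_client 3.2.4
> [ ] cowboy 0.5.0-rmq3.2.4-git4b93c2d
> [E] rabbitmq_management 3.2.4
> [e] webmachine 1.10.3-rmq3.2.4-gite9359c7
EOF
```

Disabled entries ("[ ]") split into two fields, "[" and "]", so they never match the filter.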
>
> Any suggestions on next steps for debugging this? Or on what I can do to
> get this back up and into a "healthy" state?
>
> Thanks!
> Jason
>
>
>
>
> --
> Jason McIntosh
> https://github.com/jasonmcintosh/
> 573-424-7612
>