[rabbitmq-discuss] Rabbit startup command is hanging

Jason McIntosh mcintoshj at gmail.com
Thu Apr 10 22:22:40 BST 2014


So now the fun part.  I decided to try and rebuild the middle node (I have
boxes 10, 11 and 12).  However, I can't get the middle node to reconnect to
the cluster.  Removing its mnesia directory allowed it to start, but it
can't rejoin the cluster.  So I tried removing the node from the cluster,
e.g.:

rabbitmqctl -n cluster@rabbitmqm10 forget_cluster_node cluster@rabbitmqm11

But the above never responds - it's just sitting there hanging.
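
Since the command blocks indefinitely, one way to keep a shell or script from getting stuck is to run it under a time limit. What follows is only a minimal sketch in Python: the helper names run_with_timeout and forget_node_argv and the 30 second default are assumptions made for illustration, not part of RabbitMQ. It does not explain or fix the hang, it only reports it.

```python
import subprocess


def run_with_timeout(argv, timeout_s=30.0):
    """Run a command and return (status, output).

    status is "ok" (exit code 0), "failed" (non-zero exit code)
    or "timeout" (the command did not finish within timeout_s).
    """
    try:
        proc = subprocess.run(
            argv,
            stdout=subprocess.PIPE,
            stderr=subprocess.STDOUT,  # fold stderr into the captured output
            universal_newlines=True,   # decode output as text
            timeout=timeout_s,
        )
    except subprocess.TimeoutExpired:
        # subprocess.run kills the child process before raising this.
        return "timeout", ""
    status = "ok" if proc.returncode == 0 else "failed"
    return status, proc.stdout


def forget_node_argv(via_node, dead_node, ctl="rabbitmqctl"):
    """Build the forget_cluster_node invocation as an argument list."""
    return [ctl, "-n", via_node, "forget_cluster_node", dead_node]
```

For example, run_with_timeout(forget_node_argv("cluster@rabbitmqm10", "cluster@rabbitmqm11"), 60) would return ("timeout", "") after a minute instead of sitting there forever.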

Running rabbitmqctl -n cluster@rabbitmqm11 status from the other nodes works
fine.  I'm about at a loss as to how the heck to repair things.  I can't
remove the node from the cluster, I can't start it with the mnesia
directory in its current state, and removing the mnesia directory and
trying to add it back in fails with "....done (already_member).".  Running
rabbitmqctl update_cluster_nodes cluster@rabbitmqm10 also just sits there
without responding.
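
For reference, the order in which these commands are normally meant to be run when a node has lost its state can be written down as data. This is a sketch only: the function name rejoin_plan is hypothetical, the sequence is the general one from the RabbitMQ clustering documentation (stop the app on the rebuilt node, forget it from a running node, join, start), and it has not been verified against the hang described in this thread, where the forget_cluster_node step itself never returns. Nothing here is executed; the function only builds argument lists.

```python
def rejoin_plan(rebuilt_node, running_node, ctl="rabbitmqctl"):
    """Return the rabbitmqctl invocations, in order, as argv lists.

    rebuilt_node: the node whose mnesia directory was wiped.
    running_node: a healthy cluster member used to drop the stale entry.
    """
    return [
        # On the rebuilt node: stop the rabbit application, keep the Erlang VM up.
        [ctl, "-n", rebuilt_node, "stop_app"],
        # From a healthy node: remove the stale membership record.
        [ctl, "-n", running_node, "forget_cluster_node", rebuilt_node],
        # On the rebuilt node: join via the healthy node, then start again.
        [ctl, "-n", rebuilt_node, "join_cluster", running_node],
        [ctl, "-n", rebuilt_node, "start_app"],
    ]
```

Printing " ".join(argv) for each entry of rejoin_plan("cluster@rabbitmqm11", "cluster@rabbitmqm10") gives the four command lines to review before running anything by hand.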


I'm starting to really worry I'm going to have to completely rebuild my
cluster...
Jason



On Thu, Apr 10, 2014 at 2:55 PM, Jason McIntosh <mcintoshj at gmail.com> wrote:

> Not sure what's going on here.  Just upgraded my cluster from 3.2.3 to
> 3.2.4 (including a restart of the machine).  On startup, two of my initial
> nodes started fine, but when the third node in the cluster started, the
> "/etc/init.d/rabbitmq-server start" just sits at "Starting rabbitmq-server:
> " without ever finishing.  Doing a rabbitmqctl status shows:
> Status of node cluster@rabbitmqm11p ...
> [{pid,62505},
>  {running_applications,[{os_mon,"CPO  CXC 138 46","2.2.14"},
>                         {inets,"INETS  CXC 138 49","5.9.8"},
>                         {mnesia,"MNESIA  CXC 138 12","4.11"},
>                         {amqp_client,"RabbitMQ AMQP Client","3.2.4"},
>                         {xmerl,"XML parser","1.3.6"},
>                         {eldap,"Ldap api","1.0.2"},
>                         {sasl,"SASL  CXC 138 11","2.3.4"},
>                         {stdlib,"ERTS  CXC 138 10","1.19.4"},
>                         {kernel,"ERTS  CXC 138 10","2.16.4"}]},
>  {os,{unix,linux}},
>  {erlang_version,"Erlang R16B03-1 (erts-5.10.4) [source] [64-bit]
> [smp:24:24] [async-threads:30] [hipe] [kernel-poll:true]\n"},
>  {memory,[{total,48504352},
>           {connection_procs,2808},
>           {queue_procs,0},
>           {plugins,0},
>           {other_proc,16290632},
>           {mnesia,1783536},
>           {mgmt_db,0},
>           {msg_index,0},
>           {other_ets,1120896},
>           {binary,725448},
>           {code,19691642},
>           {atom,703377},
>           {other_system,8186013}]},
>  {file_descriptors,[{total_limit,12188},
>                     {total_used,0},
>                     {sockets_limit,10967},
>                     {sockets_used,0}]},
>  {processes,[{limit,1048576},{used,117}]},
>  {run_queue,0},
>  {uptime,83}]
> ...done.
>
>
> In the web management interface, I see this:
> Node statistics not available
> Memory details
>
>   Connections            2.7kB
>   Queues                 0B
>   Plugins                0B
>   Other process memory   16MB
>   Mnesia                 1.7MB
>   Message store index    0B
>   Management database    0B
>   Other ETS tables       1.1MB
>   Binaries               708kB
>   Code                   19MB
>   Atoms                  687kB
>   Other system           7.8MB
>
>
> So rabbit appears to have sort of started, but certain things are not
> started (e.g. plugins).  Plugins list is:
> [e] amqp_client                       3.2.4
> [ ] cowboy                            0.5.0-rmq3.2.4-git4b93c2d
> [ ] eldap                             3.2.4-gite309de4
> [e] mochiweb                          2.7.0-rmq3.2.4-git680dba8
> [ ] rabbitmq_amqp1_0                  3.2.4
> [E] rabbitmq_auth_backend_ldap        3.2.4
> [ ] rabbitmq_auth_mechanism_ssl       3.2.4
> [E] rabbitmq_consistent_hash_exchange 3.2.4
> [E] rabbitmq_federation               3.2.4
> [E] rabbitmq_federation_management    3.2.4
> [ ] rabbitmq_jsonrpc                  3.2.4
> [ ] rabbitmq_jsonrpc_channel          3.2.4
> [ ] rabbitmq_jsonrpc_channel_examples 3.2.4
> [E] rabbitmq_management               3.2.4
> [E] rabbitmq_management_agent         3.2.4
> [E] rabbitmq_management_visualiser    3.2.4
> [ ] rabbitmq_mqtt                     3.2.4
> [E] rabbitmq_shovel                   3.2.4
> [E] rabbitmq_shovel_management        3.2.4
> [ ] rabbitmq_stomp                    3.2.4
> [ ] rabbitmq_tracing                  3.2.4
> [e] rabbitmq_web_dispatch             3.2.4
> [ ] rabbitmq_web_stomp                3.2.4
> [ ] rabbitmq_web_stomp_examples       3.2.4
> [ ] rfc4627_jsonrpc                   3.2.4-git5e67120
> [ ] sockjs                            0.3.4-rmq3.2.4-git3132eb9
> [e] webmachine                        1.10.3-rmq3.2.4-gite9359c7
>
>
> Any suggestions on next steps on debugging this?  Or what I can do to get
> this back up and in a "healthy" state?
>
> Thanks!
> Jason



-- 
Jason McIntosh
https://github.com/jasonmcintosh/
573-424-7612