[rabbitmq-discuss] Rabbit startup command is hanging

Jason McIntosh mcintoshj at gmail.com
Fri Apr 11 15:20:29 BST 2014


So, a final email on this.  I ended up having to kill all the processes on
all nodes in the cluster, then start them back up in order to recover.
At that point, the node that wouldn't rejoin the cluster came online and
started syncing messages and responding fine.  I'm guessing I had a
deadlock someplace, though I'm not totally sure where it would be.  I'll
keep an eye on this and see what else I can discover.  *SIGH* I really need
to learn to debug and work with Erlang better.
Thanks all,
Jason
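
A rough sketch of a full stop/start like the one described above, in case
it helps anyone who lands on this thread.  It is a dry run that only builds
and prints the commands; nothing is executed.  Assumptions: the third node
name (cluster@rabbitmqm12) is guessed from the box numbering, the stop order
shown is illustrative and not necessarily the order actually used, and the
start order follows the clustering guide's rule that the last node stopped
should be the first one started.  plan_full_restart is a hypothetical
helper.  If "rabbitmqctl stop" hangs as well, killing the processes by hand
is the fallback.

```python
from typing import List


def plan_full_restart(stop_order: List[str]) -> List[str]:
    """Build the commands for a full cluster stop/start without running them.

    stop_order holds node names such as "cluster@rabbitmqm10", listed in
    the order they should be stopped.
    """
    commands = []
    # Stop every node (the Erlang VM included) in the given order.
    for node in stop_order:
        commands.append(f"rabbitmqctl -n {node} stop")
    # Start in reverse: the last node stopped is the first one started.
    for node in reversed(stop_order):
        host = node.split("@", 1)[1]
        commands.append(f"[on {host}] /etc/init.d/rabbitmq-server start")
    return commands


if __name__ == "__main__":
    # Illustrative order only; cluster@rabbitmqm12 is an assumed node name.
    for command in plan_full_restart(
        ["cluster@rabbitmqm11", "cluster@rabbitmqm12", "cluster@rabbitmqm10"]
    ):
        print(command)
```

Printing the plan first makes it easy to check the ordering before touching
a live cluster.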


On Thu, Apr 10, 2014 at 4:22 PM, Jason McIntosh <mcintoshj at gmail.com> wrote:

> So now the fun part.  I decided to try and rebuild the middle node (I have
> boxes 10, 11 and 12).  However, I can't get the middle node to reconnect to
> the cluster.  Removing its mnesia directory allowed it to start, but it
> can't rejoin the cluster.  So I tried removing the node from the cluster,
> e.g.:
>
> rabbitmqctl -n cluster@rabbitmqm10 forget_cluster_node cluster@rabbitmqm11
>
> But the above never responds - it's just sitting there hanging.
>
> rabbitmqctl -n cluster@rabbitmqm11 status run from the other nodes works
> fine.  I'm about at a loss as to how the heck to repair things.  I can't
> remove the node from the cluster, I can't start it with the mnesia
> directory in its current state, and removing the mnesia directory and
> trying to add it back in fails with "....done (already_member).".
> Running rabbitmqctl update_cluster_nodes cluster@rabbitmqm10 also just
> sits there, doing nothing and not responding.
>
>
> I'm starting to really worry I'm going to have to completely rebuild my
> cluster...
> Jason
>
>
>
> On Thu, Apr 10, 2014 at 2:55 PM, Jason McIntosh <mcintoshj at gmail.com> wrote:
>
>> Not sure what's going on here.  Just upgraded my cluster from 3.2.3 to
>> 3.2.4 (including a restart of the machine).  On startup, two of my initial
>> nodes started fine, but when the third node in the cluster started, the
>> "/etc/init.d/rabbitmq-server start" just sits at "Starting rabbitmq-server:
>> " without ever finishing.  Doing a rabbitmqctl status shows:
>> Status of node cluster@rabbitmqm11p ...
>> [{pid,62505},
>>  {running_applications,[{os_mon,"CPO  CXC 138 46","2.2.14"},
>>                         {inets,"INETS  CXC 138 49","5.9.8"},
>>                         {mnesia,"MNESIA  CXC 138 12","4.11"},
>>                         {amqp_client,"RabbitMQ AMQP Client","3.2.4"},
>>                         {xmerl,"XML parser","1.3.6"},
>>                         {eldap,"Ldap api","1.0.2"},
>>                         {sasl,"SASL  CXC 138 11","2.3.4"},
>>                         {stdlib,"ERTS  CXC 138 10","1.19.4"},
>>                         {kernel,"ERTS  CXC 138 10","2.16.4"}]},
>>  {os,{unix,linux}},
>>  {erlang_version,"Erlang R16B03-1 (erts-5.10.4) [source] [64-bit]
>> [smp:24:24] [async-threads:30] [hipe] [kernel-poll:true]\n"},
>>  {memory,[{total,48504352},
>>           {connection_procs,2808},
>>           {queue_procs,0},
>>           {plugins,0},
>>           {other_proc,16290632},
>>           {mnesia,1783536},
>>           {mgmt_db,0},
>>           {msg_index,0},
>>           {other_ets,1120896},
>>           {binary,725448},
>>           {code,19691642},
>>           {atom,703377},
>>           {other_system,8186013}]},
>>  {file_descriptors,[{total_limit,12188},
>>                     {total_used,0},
>>                     {sockets_limit,10967},
>>                     {sockets_used,0}]},
>>  {processes,[{limit,1048576},{used,117}]},
>>  {run_queue,0},
>>  {uptime,83}]
>> ...done.
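
A note on the status output above: running_applications lists mnesia,
amqp_client and friends but has no rabbit entry, which a fully started
broker would show.  That suggests the node is stuck partway through boot.
A minimal sketch for checking this from captured "rabbitmqctl status" text
follows; the function names and the regex-based parsing are assumptions for
illustration, not anything rabbitmqctl itself provides.

```python
import re


def running_applications(status_text):
    """Return the set of application names found in a status dump."""
    # Grab everything between "{running_applications,[" and the closing "]}".
    section = re.search(
        r"\{running_applications,\s*\[(.*?)\]\}", status_text, re.S
    )
    if section is None:
        return set()
    # Each entry looks like {name,"description","version"}.
    return set(re.findall(r"\{([a-z0-9_]+),\s*\"", section.group(1)))


def rabbit_is_running(status_text):
    """True when the rabbit application itself is listed as running."""
    return "rabbit" in running_applications(status_text)


if __name__ == "__main__":
    # Shortened copy of the output quoted in this thread.
    sample = '''[{pid,62505},
 {running_applications,[{os_mon,"CPO  CXC 138 46","2.2.14"},
                        {mnesia,"MNESIA  CXC 138 12","4.11"},
                        {kernel,"ERTS  CXC 138 10","2.16.4"}]},
 {os,{unix,linux}}]'''
    print(rabbit_is_running(sample))
```
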
>>
>>
>> In the web management interface, I see this:
>> Node statistics not available
>> Memory details
>>
>>  Connections 2.7kB  Queues 0B  Plugins 0B  Other process memory 16MB
>> Mnesia 1.7MB  Message store index 0B  Management database 0B  Other ETS
>> tables 1.1MB   Binaries 708kB  Code 19MB  Atoms 687kB  Other system 7.8MB
>>
>>
>> So rabbit appears to have partly started, but certain things are not
>> started (e.g. plugins).  The plugins list is:
>> [e] amqp_client                       3.2.4
>> [ ] cowboy                            0.5.0-rmq3.2.4-git4b93c2d
>> [ ] eldap                             3.2.4-gite309de4
>> [e] mochiweb                          2.7.0-rmq3.2.4-git680dba8
>> [ ] rabbitmq_amqp1_0                  3.2.4
>> [E] rabbitmq_auth_backend_ldap        3.2.4
>> [ ] rabbitmq_auth_mechanism_ssl       3.2.4
>> [E] rabbitmq_consistent_hash_exchange 3.2.4
>> [E] rabbitmq_federation               3.2.4
>> [E] rabbitmq_federation_management    3.2.4
>> [ ] rabbitmq_jsonrpc                  3.2.4
>> [ ] rabbitmq_jsonrpc_channel          3.2.4
>> [ ] rabbitmq_jsonrpc_channel_examples 3.2.4
>> [E] rabbitmq_management               3.2.4
>> [E] rabbitmq_management_agent         3.2.4
>> [E] rabbitmq_management_visualiser    3.2.4
>> [ ] rabbitmq_mqtt                     3.2.4
>> [E] rabbitmq_shovel                   3.2.4
>> [E] rabbitmq_shovel_management        3.2.4
>> [ ] rabbitmq_stomp                    3.2.4
>> [ ] rabbitmq_tracing                  3.2.4
>> [e] rabbitmq_web_dispatch             3.2.4
>> [ ] rabbitmq_web_stomp                3.2.4
>> [ ] rabbitmq_web_stomp_examples       3.2.4
>> [ ] rfc4627_jsonrpc                   3.2.4-git5e67120
>> [ ] sockjs                            0.3.4-rmq3.2.4-git3132eb9
>> [e] webmachine                        1.10.3-rmq3.2.4-gite9359c7
>>
>>
>> Any suggestions on next steps for debugging this?  Or what I can do to get
>> this back up and in a "healthy" state?
>>
>> Thanks!
>> Jason
>>
>>
>>
>>
>> --
>> Jason McIntosh
>> https://github.com/jasonmcintosh/
>> 573-424-7612
>>
>
>
>
> --
> Jason McIntosh
> https://github.com/jasonmcintosh/
> 573-424-7612
>



-- 
Jason McIntosh
https://github.com/jasonmcintosh/
573-424-7612