[rabbitmq-discuss] Node hung and not responding to rabbitmqctl

Bill Moseley moseley at hank.org
Fri Jan 14 15:01:06 GMT 2011


I've been working with clustering (on a single test machine). I stopped
the second RAM node without a problem, but now the initial disk node seems
unresponsive.  All of the rabbitmqctl commands hang.  Initially, connecting
to the node would also hang, but now Rabbit doesn't seem to be listening on
the port any more.
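
A rough way to tell "connection refused" apart from "connection accepted" is sketched below. It assumes bash and the coreutils timeout command are available, and that the broker uses the default AMQP port 5672 (the port is not stated above). The port_open helper is a made-up name for illustration.

```shell
# port_open HOST PORT: succeeds only if a TCP connection is accepted within 3s.
port_open() {
    # /dev/tcp is a bash feature; timeout (coreutils) guards against a hung connect.
    timeout 3 bash -c "exec 3<>/dev/tcp/$1/$2" 2>/dev/null
}

# 5672 is the default AMQP port (an assumption; adjust if configured differently).
if port_open 127.0.0.1 5672; then
    echo "something is accepting connections on 5672"
else
    echo "no TCP connection could be made to 5672"
fi
```

An accepted connection only shows that a socket is open; it does not prove the broker is answering AMQP requests.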


$ sudo rabbitmqctl stop_app
Stopping node rabbit@bumby2 ...


After a few minutes I interrupt with ^C, press "k" for kill, and get the following:

^C
BREAK: (a)bort (c)ontinue (p)roc info (i)nfo (l)oaded
       (v)ersion (k)ill (D)b-tables (d)istribution

k

Process Information

--------------------------------------------------
=proc:<0.38.0>
State: Waiting
Name: inet_gethost_native
Spawned as: inet_gethost_native:server_init/2
Spawned by: <0.37.0>
Started: Fri Jan 14 06:38:10 2011
Message queue length: 0
Number of heap fragments: 0
Heap fragment data: 0
Link list: [#Port<0.288>, <0.37.0>]
Dictionary: [{rid,1},{num_requests,0}]
Reductions: 64
Stack+heap: 233
OldHeap: 0
Heap unused: 190
OldHeap unused: 0
Stack dump:
Program counter: 0xb771c718 (inet_gethost_native:main_loop/1 + 20)
CP: 0x00000000 (invalid)
arity = 0

0xb74eeecc Return addr 0x08201594 (<terminate process normally>)
y(0)     {state,#Port<0.288>,8000,12302,16399,<0.37.0>,4,{statistics,0,0,0,0,0,0,0,0}}
(k)ill (n)ext (r)eturn:

There seem to be a lot of these inet_gethost processes hanging around:


$ ps auxf | grep gethost
rabbitmq  3700  0.0  0.0   1868   436 ?        Ss   Jan13   0:00  \_ inet_gethost 4
rabbitmq  3701  0.0  0.0   1916   540 ?        S    Jan13   0:00  |   \_ inet_gethost 4
rabbitmq  3995  0.0  0.0   1868   436 ?        Ss   Jan13   0:00  \_ inet_gethost 4
rabbitmq  3996  0.0  0.0   1916   536 ?        S    Jan13   0:00      \_ inet_gethost 4
rabbitmq  4370  0.0  0.0   1868   432 ?        Ss   Jan13   0:00  \_ inet_gethost 4
rabbitmq  4371  0.0  0.0   1916   536 ?        S    Jan13   0:00      \_ inet_gethost 4
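
Since inet_gethost is the helper program the Erlang runtime starts for native hostname lookups, one way to see which process owns each of these is to print every inet_gethost PID alongside its parent. This is only a sketch, assuming the procps versions of ps and pgrep; describe_parent is a made-up helper name.

```shell
# describe_parent PID: print "PID <- PPID PARENT_COMMAND" for one process.
describe_parent() {
    ppid=$(ps -o ppid= -p "$1" | tr -d ' ')
    printf '%s <- %s %s\n' "$1" "$ppid" "$(ps -o comm= -p "$ppid")"
}

# Walk every inet_gethost process; pgrep prints nothing if there are none.
for pid in $(pgrep -x inet_gethost); do
    describe_parent "$pid"
done
```

In the listing above the processes come in pairs, so some parents will themselves be inet_gethost; the parent of the top one in each pair is presumably the Erlang VM (often named beam or beam.smp).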


And logs:

=WARNING REPORT==== 13-Jan-2011::21:39:43 ===
exception on TCP connection <0.30329.54> from 127.0.0.1:50222
connection_closed_abruptly

=INFO REPORT==== 13-Jan-2011::21:39:43 ===
closing TCP connection <0.30329.54> from 127.0.0.1:50222

=ERROR REPORT==== 14-Jan-2011::06:39:16 ===
** Node rabbitmqctl7173@bumby2 not responding **
** Removing (timedout) connection **



Versions: Ubuntu, Erlang 1:13.b.1-dfsg-2ubuntu1.1 / RabbitMQ 2.2.0-1

Are there any docs that describe debugging techniques for those of us who
have no Erlang experience?  I'm not even sure what to kill -9 if I had to. ;)

-- 
Bill Moseley
moseley at hank.org