[rabbitmq-discuss] rabbitmqctl stall/hang when leaving a cluster

Matt Pietrek mpietrek at skytap.com
Thu Feb 23 00:30:03 GMT 2012


Let me add some additional information, and re-summarize what I'm seeing.

In our startup script for RabbitMQ we do the following:

rabbitmq-server -detached
rabbitmqctl status
<Extract the PID from rabbitmqctl status, write to our PIDFILE>
rabbitmqctl wait PIDFILE

On shutdown, we do:

rabbitmqctl stop PIDFILE
rm PIDFILE

In normal circumstances, this works just fine hundreds of times in a row.
However, as mentioned earlier in the thread, sometimes when restarting the
node that had a stats database, the "rabbitmqctl wait" hangs.
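
As a stopgap, the wait step could be bounded so that a hang fails fast instead
of blocking the startup script. This is only a rough sketch, not something we
run today: it assumes GNU coreutils "timeout" is installed, and the names
run_bounded, wait_for_rabbit and the 60-second default are made up for
illustration (they are not part of RabbitMQ).

```shell
#!/bin/sh
# Sketch only: bound "rabbitmqctl wait" so a hang becomes a visible error.
# Assumes GNU coreutils `timeout`; all function names here are hypothetical.

# Run a command, giving up after $1 seconds.
# GNU timeout exits with status 124 when the time limit is hit.
run_bounded() {
    secs="$1"
    shift
    timeout "$secs" "$@"
}

# Wait for the node whose PID is recorded in the file $1.
# $2 is the limit in seconds (60 is an arbitrary default).
wait_for_rabbit() {
    pidfile="$1"
    secs="${2:-60}"
    rc=0
    run_bounded "$secs" rabbitmqctl wait "$pidfile" || rc=$?
    if [ "$rc" -eq 124 ]; then
        echo "rabbitmqctl wait still blocked after ${secs}s" >&2
    elif [ "$rc" -ne 0 ]; then
        echo "rabbitmqctl wait failed with status $rc" >&2
    fi
    return "$rc"
}
```

In the startup script, "rabbitmqctl wait PIDFILE" would then become
"wait_for_rabbit PIDFILE 60 || exit 1". This does not explain the hang; it
only keeps the script from blocking forever when it happens.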

Thanks,

Matt


On Wed, Feb 22, 2012 at 3:05 PM, Matt Pietrek <mpietrek at skytap.com> wrote:

> I was able to just now repro this again. Because of some layers of
> scripting, it turns out it's not hanging up when running rabbitmq-server.
> Rather, it's hanging up when running "rabbitmqctl wait <pidfile>"
>
> The output from rabbitmqctl when run on any node is:
>
> Waiting for rabbit at play ...   << where 'play' is the node that's restarting >>
> pid is 21925 ...                 << the PID value differs depending on which node it's run on >>
>
>
> The last few lines of the event log for the 'play' node:
>
>
> =INFO REPORT==== 22-Feb-2012::14:52:08 ===
> Stopping Rabbit
>
> =INFO REPORT==== 22-Feb-2012::14:52:08 ===
>     application: rabbitmq_management
>     exited: stopped
>     type: permanent
>
> =INFO REPORT==== 22-Feb-2012::14:52:08 ===
>     application: rabbitmq_management_agent
>     exited: stopped
>     type: permanent
>
> =INFO REPORT==== 22-Feb-2012::14:52:08 ===
> stopped TCP Listener on 0.0.0.0:5672
>
> =INFO REPORT==== 22-Feb-2012::14:52:08 ===
>     application: rabbit
>     exited: stopped
>     type: permanent
>
> =INFO REPORT==== 22-Feb-2012::14:52:08 ===
>     application: os_mon
>     exited: stopped
>     type: permanent
>
> =INFO REPORT==== 22-Feb-2012::14:52:08 ===
>     application: mnesia
>     exited: stopped
>     type: permanent
>
> =INFO REPORT==== 22-Feb-2012::14:52:08 ===
> Halting Erlang VM
>
> =INFO REPORT==== 22-Feb-2012::14:52:13 ===
> Limiting to approx 924 file handles (829 sockets)
>
>
>
>
> On Wed, Feb 22, 2012 at 10:40 AM, Matt Pietrek <mpietrek at skytap.com> wrote:
>
>> Unfortunately, I don't see anything in the logs. I'll try again. Is there
>> anything I can do on my end to gather more information?
>>
>> Matt
>>
>>
>> On Wed, Feb 22, 2012 at 3:45 AM, Simon MacMullen <simon at rabbitmq.com> wrote:
>>
>>> On 21/02/12 18:49, Matt Pietrek wrote:
>>>
>>>> If I try this action on the node with the stats database, rabbitmqctl
>>>> waits forever and I have to ctrl-c out.  If I then try "rabbitmqctl
>>>> stop", it errors out, saying that the node is down.
>>>>
>>>
>>> Hmm. Needless to say, this does not happen when I try it :(
>>>
>>> Does anything show up in the logs on that node at this point?
>>>
>>> Cheers, Simon
>>>
>>> --
>>> Simon MacMullen
>>> RabbitMQ, VMware
>>>
>>
>>
>