[rabbitmq-discuss] rabbitmqctl stall/hang when leaving a cluster
Simon MacMullen
simon at rabbitmq.com
Thu Feb 23 11:52:17 GMT 2012
I *think* the stats database is a red herring. You say this happens when
restarting?
On 23/02/12 00:30, Matt Pietrek wrote:
> Let me add some additional information, and re-summarize what I'm seeing.
>
> In our startup script for RabbitMQ we do the following;
>
> rabbitmq-server -detached
> rabbitmqctl status
> <Extract the PID from rabbitmqctl status, write to our PIDFILE>
There's a potential race here if an old server is running (maybe about
to shut down?). rabbitmqctl status could pick up the old pid.
> rabbitmqctl wait PIDFILE
However, rabbitmqctl wait should then detect that the pid has died and
fail, unless the OS has reused the pid, which is presumably unlikely.
But rabbitmqctl wait will wait indefinitely for as long as the pid is alive
and is not yet a fully functional rabbit node. So I'd check two things:
1) You should fix that race; it can be done safely:
Do not use rabbitmq-server -detached and rabbitmqctl status to get the
pid. Instead set RABBITMQ_PID_FILE and background the rabbitmq-server
script. You will then *definitely* get the right pid since the script
writes its own pid then execs - no race possible.
2) Capture the stdout of rabbitmq-server when you start it - if
rabbitmqctl wait still hangs, see how far it's got / what it's doing.
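
As a rough sketch of both points together, something like the following
might replace the start sequence quoted above. The file paths and the
start_rabbit function name are made up for illustration; only
RABBITMQ_PID_FILE, rabbitmq-server and rabbitmqctl wait come from the
discussion here, so adjust to your own setup.

```shell
#!/bin/sh
# Hypothetical locations; change these to suit your installation.
PIDFILE="${PIDFILE:-/var/run/rabbitmq/pid}"
STARTUP_LOG="${STARTUP_LOG:-/var/log/rabbitmq/startup.log}"

start_rabbit() {
    # The server script writes its own pid to this file and then execs,
    # so the pid recorded is that of the node we just started.
    export RABBITMQ_PID_FILE="$PIDFILE"

    # Background the script instead of using -detached, and keep its
    # stdout/stderr so a stalled start can be inspected afterwards.
    rabbitmq-server >"$STARTUP_LOG" 2>&1 &

    # Block until the node with that pid is fully up. This fails if the
    # pid dies, and waits for as long as the pid is alive but not ready.
    rabbitmqctl wait "$PIDFILE"
}
```

Calling start_rabbit from the startup script then gives you the exit
status of rabbitmqctl wait, and if it hangs, the startup log shows how far
the server got.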
Cheers, Simon
--
Simon MacMullen
RabbitMQ, VMware