[rabbitmq-discuss] RabbitMQ waits forever for PID file during startup
Cesar Munoz
cesar.munoz at ammeon.com
Thu Jul 10 15:02:53 BST 2014
Hi Simon,
we have tried to find the cause of the problem with the initial
installation, but no luck yet. It is very difficult to track down, as it is
completely non-deterministic!
In the meantime, we installed the latest version of RabbitMQ, which
includes the set -e fix, but the same issue still happened. Given the output
of ps auxf
https://gist.github.com/anonymous/62239513b154179a8a4e
it looks like
/bin/sh /etc/init.d/rabbitmq-server start
and
/bin/sh /usr/sbin/rabbitmqctl wait /var/run/rabbitmq/pid
were running concurrently. Is there any chance that this created some sort
of race condition between these two processes that would keep the set -e
fix from working?
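
For anyone trying to reproduce this, here is a minimal sketch of a bounded
check for the PID file, so that a missing file shows up as an error instead
of a wait that never ends. The function name wait_for_pid_file and the
30-second default are assumptions made for illustration; this is not part
of the RabbitMQ scripts and not what rabbitmqctl wait does internally.

```shell
#!/bin/sh
# Hypothetical helper (not part of RabbitMQ): wait until a PID file exists
# and is non-empty, but give up after a timeout instead of blocking forever.
wait_for_pid_file() {
    pid_file=$1
    timeout=${2:-30}    # seconds; assumed default, adjust as needed
    elapsed=0
    # -s is true only when the file exists and has a size greater than zero
    while [ ! -s "$pid_file" ]; do
        if [ "$elapsed" -ge "$timeout" ]; then
            echo "timed out after ${timeout}s waiting for $pid_file" >&2
            return 1
        fi
        sleep 1
        elapsed=$((elapsed + 1))
    done
    # Print the PID so the caller can check it against ps output
    cat "$pid_file"
}
```

Running something like wait_for_pid_file /var/run/rabbitmq/pid 60 right
after the service start would at least tell us whether the file was ever
written during a failed installation.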
Cheers,
Cesar.
On 6 June 2014 11:55, Cesar Munoz <cesar.munoz at ammeon.com> wrote:
> Hi Simon,
>
> the ulimits for rabbitmq user are pretty much the same, the only
> difference is that max user processes is set to 1024 instead of 2066207.
>
> About the system itself, it is true that there has to be something strange
> going on if a shell redirection can fail, but I'm checking the
> configuration and I don't see anything especially unusual.
>
> We are using Red Hat 6.4, and these are the parameters that we set in the
> sysctl.conf:
> http://pastebin.com/SfJBwrna
>
> The rest of the parameters in the kickstart file are pretty much the
> standard ones.
> This is an intermittent issue (we are testing how often it happens; so far
> we have had 3 failures in 13 installations), which makes it harder to track
> down!
> Either way, restarting the service works, so it looks like whatever causes
> the problem disappears after a while. I've been trying to find what could
> make this non-deterministic, but so far I haven't noticed anything unusual.
>
> Thanks again!
> Cesar.
>
>
> On 6 June 2014 11:27, Simon MacMullen <simon at rabbitmq.com> wrote:
>
>> On 06/06/2014 10:49AM, Cesar Munoz wrote:
>>
>>> Hi Simon,
>>>
>>> the set -e looks like a very good idea, at least the process will return
>>> the failure straight away!
>>>
>>
>> Sure!
>>
>>
>>> These are the ulimits:
>>>
>>> [root at ms1 ~]# ulimit -a
>>>
>>
>> <snip>
>>
>> Those are the ulimits which apply to root - maybe they are different for
>> the "rabbitmq" user?
>>
>> But more to the point: we're failing to do something very very simple
>> here, there has to be something weird about this system if echo or shell
>> redirection can fail with an error message about memory allocation.
>>
>> So have you configured anything unusual about this system?
>>
>>
>> Cheers, Simon
>>
>> --
>> Simon MacMullen
>> RabbitMQ, Pivotal
>>
>
>