[rabbitmq-discuss] rabbitmq-server crashes hard while consuming 31GB of RAM

Muharem Hrnjadovic mh at foldr3.com
Thu Nov 24 14:08:34 GMT 2011


On 11/23/2011 05:40 PM, Simon MacMullen wrote:
> On 23/11/11 15:05, Muharem Hrnjadovic wrote:
>> Please see the "sudo rabbitmqctl report" link on:
>>      https://bugs.launchpad.net/openquake/+bug/894024/comments/1
> 
> Hi. Thanks for this.
Hello Simon, thank *you* for looking into this.

> There are a number of interesting things going on:
> 
> 1) Your "rabbitmqctl report" invocation shows about 28000 channels
> open and no connections. Did you edit connection details out of this?
No.

> 2) About 9GB of memory is used by binaries. This may indicate that you
> have some large messages around - are your messages particularly
> large?
No, as I said in my response to Ask, the messages are all strings like:

    '::JOB::635::!hazard_curve_poes!0!-6708111403636643040'

> 3) While the memory alarm was set and cleared a number of times, in
> the end the Erlang VM crashed with:
> 
>   no next heap size found: -2106532246, offset 0
>   Aborted
> 
> This error message indicates that the Erlang VM was not able to
> increase the heap size of a particular process as it ran off the end
> of a table of permitted heap sizes. On a 64GB machine the maximum heap
> size is 6.6 Exabytes: I hope you did not run into that. Unfortunately
> there is a bug in Erlang R13 which garbles the error message above
> (which is why the alleged heap size is a negative number). So:
> 
> 3a) is this a 64 bit machine?
Yes.

> 3b) Can you run:
> 
> su rabbitmq -s /bin/sh -c 'erl -remsh rabbit@`hostname -s` -sname foo -eval "io:format(\"~p\", [lists:sublist(lists:reverse(lists:sort([{process_info(Pid, memory), Pid, process_info(Pid)} || Pid <- processes()])), 30)]), halt()."'
> 
> (all on one line, as root) when rabbit is eating a lot of memory and send us the output? This should let us see if there are individual processes that are eating huge amounts of memory.
Will do as soon as I have reproduced the problem.
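
Read as a recipe, the one-liner in 3b) pairs each Erlang process with its
memory figure, sorts the pairs, reverses them so the largest come first, and
keeps the top 30. The Python sketch below only illustrates that ranking step.
It is not the Erlang command, and the ProcInfo type, the top_by_memory helper,
and the sample pids and sizes are all made up for the example.

```python
from typing import List, NamedTuple


class ProcInfo(NamedTuple):
    """Hypothetical stand-in for one {process_info(Pid, memory), Pid, ...} entry."""
    memory: int  # bytes held by the process
    pid: str     # printable pid, e.g. "<0.101.0>"


def top_by_memory(procs: List[ProcInfo], n: int = 30) -> List[ProcInfo]:
    """Return the n entries using the most memory, largest first.

    Mirrors lists:sublist(lists:reverse(lists:sort(...)), 30): tuples compare
    element by element, so memory is the primary sort key.
    """
    return sorted(procs, reverse=True)[:n]


# Invented sample data, for illustration only.
sample = [
    ProcInfo(2_048, "<0.101.0>"),
    ProcInfo(9_000_000, "<0.204.0>"),
    ProcInfo(512, "<0.33.0>"),
]

for entry in top_by_memory(sample, n=2):
    print(entry.pid, entry.memory)
```

If a handful of entries dominate such a list, the memory is concentrated in
individual processes; if the list is flat, it is spread across many of them.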

> 3c) Can you upgrade the Erlang VM to see if we can get a better
> version of that error message? Updated debs are available at:
> 
>  http://www.erlang-solutions.com/section/132/erlang-otp-packages
> 
> Due to limitations in dpkg you might need to:
> 
> # apt-get remove erlang-base
> # dpkg -i esl-erlang_14.b.4-1_amd64.deb
> # apt-get install erlang-nox
> # dpkg -i rabbitmq-server_2.7.0-1_all.deb
> 
> (Yes, this is less than ideal but we've only just found out about the
> ESL packages...)
Let me provide the answer to 3b) first and we'll take it from there.

Best regards/Mit freundlichen Grüßen

-- 
Muharem Hrnjadovic <mh at foldr3.com>
Public key id   : B2BBFCFC
Key fingerprint : A5A3 CC67 2B87 D641 103F  5602 219F 6B60 B2BB FCFC
