[rabbitmq-discuss] RabbitMQ crashes hard when it runs out of memory
Stephen Day
sjaday at gmail.com
Fri Oct 23 01:47:50 BST 2009
I won't bore you with all the output, but I tracked down the binary usage to
these two processes:
[{Pid1, _Info, _Bin}, {Pid2, _Info2, _Bin2} | Other] =
    [{P, process_info(P), BinInfo}
     || {P, {binary, BinInfo}} <- [{P, process_info(P, binary)} || P <- processes()],
        length(BinInfo) > 100000].
<0.157.0>          gen:init_it/6                      1682835   1873131    0
                   gen_server2:process_next_msg/8                         13
<0.158.0>          rabbit_persister:init/1           19590700  29789397    0
rabbit_persister   gen_server2:process_next_msg/8                         13
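(As an aside, roughly the same information can be pulled out as total referenced
binary bytes per process rather than a count of entries. This is just a sketch,
assuming the usual {_Id, Size, _RefCount} layout of the 'binary' info tuples; the
10 MB cutoff is arbitrary and not from the session above:)

BinBytes = fun(P) ->
               %% Sum the sizes of all binaries referenced by P; dead
               %% processes report 'undefined' and count as 0.
               case process_info(P, binary) of
                   {binary, Bins} -> lists:sum([Sz || {_Id, Sz, _Refs} <- Bins]);
                   undefined      -> 0
               end
           end,
%% Largest referenced-binary footprint first.
lists:reverse(lists:keysort(2, [{P, BinBytes(P)} || P <- processes(),
                                                    BinBytes(P) > 10 * 1024 * 1024])).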
I tried your suggestion to free memory and check, but it looks like most of it
was held up in the persister:
35> M = [{erlang:garbage_collect(P), memory(total)} || P <- erlang:processes()].
51> [{P, Mem} || {Mem, P} <- lists:zip([Me || {true, Me} <- M], processes()),
        Mem < 842757048].
[{<0.148.0>,842753448},
{<0.149.0>,842700248},
{<0.150.0>,842700248},
{<0.151.0>,842700248},
{<0.152.0>,842697224},
{<0.154.0>,842697792},
{<0.155.0>,842724104},
{<0.156.0>,842712824},
{<0.157.0>,825951032},
{<0.158.0>,602886872},
{<0.159.0>,345002144},
{<0.177.0>,345002144},
{<0.178.0>,345002144},
{<0.179.0>,345002144},
{<0.180.0>,345002144},
{<0.181.0>,345002144},
{<0.182.0>,345002144},
{<0.183.0>,345002144},
{<0.184.0>,345002144},
{<0.245.0>,345000624},
{<0.247.0>,345001520},
{<0.248.0>,344996984},
{<0.249.0>,344995464},
{<0.250.0>,344995512},
{<0.252.0>,344996416},
{<0.253.0>,344991880},
{<0.254.0>,344991928},
{<0.261.0>,...},
{...}|...]
So it looks like the large chunks are held up between gen_server2 and
rabbit_persister.
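(The same measurement can also be done in a single pass by recording the drop in
memory(total) per process directly; a minimal sketch of the idea, not what was
actually run above:)

%% GC each process, record how much memory(total) drops as a result,
%% then show the ten biggest drops.
Deltas = [begin
              Before = erlang:memory(total),
              erlang:garbage_collect(P),
              {P, Before - erlang:memory(total)}
          end || P <- erlang:processes()],
lists:sublist(lists:reverse(lists:keysort(2, Deltas)), 10).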
_steve
On Thu, Oct 22, 2009 at 4:24 PM, Matthias Radestock <matthias at lshift.net> wrote:
> Stephen,
>
> Stephen Day wrote:
>
>> (rabbit at vs-dfw-ctl11)5> [erlang:garbage_collect(P) || P <-
>> erlang:processes()].
>> [true,true,true,true,true,true,true,true,true,true,true,
>> true,true,true,true,true,true,true,true,true,true,true,true,
>> true,true,true,true,true,true|...]
>>
>> (rabbit at vs-dfw-ctl11)6> memory().
>> [{total,145833144},
>> {processes,50900752},
>> {processes_used,50896864},
>> {system,94932392},
>> {atom,514765},
>> {atom_used,488348},
>> {binary,24622512},
>> {code,3880064},
>> {ets,64745716}]
>>
>> This really cut down on usage, so it's likely that the binary GC is falling
>> behind rabbit's requirements.
>>
>
> Agreed.
>
> How do I track down the uncollected binary heap usage to a process?
>>
>
> Binaries are shared between processes and ref-counted, so no single process
> owns them. There is a process_info item called 'binary' that provides
> information on the binaries referenced by a process, but I've never looked
> at that myself, so I don't know how useful the contained info is.
>
> One thing you could try is to run the above garbage_collect code
> interleaved with the memory reporting code to identify which process results
> in the biggest drop in memory usage when gc'ed.
>
>
> Regards,
>
> Matthias.
>