[rabbitmq-discuss] RabbitMQ crashes hard when it runs out of memory

Stephen Day sjaday at gmail.com
Fri Oct 23 01:55:32 BST 2009


I am not quite sure about the function evaluation order, but it might help
to know that <0.159.0> is the disk_log process:

<0.159.0>             disk_log:init/2                     1597    879997     0
                      disk_log:loop/1                                        4
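
(For what it's worth, a rough way to double-check what a pid is from the
shell, using the shell's pid/3 helper; the pid is just the one from the
listing above:)

P = pid(0, 159, 0).
process_info(P, [registered_name, initial_call, current_function]).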

_steve

On Thu, Oct 22, 2009 at 5:47 PM, Stephen Day <sjaday at gmail.com> wrote:

> I won't bore you with all the output, but I tracked down the binary usage
> to these two processes:
>
> [{Pid1, _Info, _Bin}, {Pid2, _Info2, _Bin2} | Other] =
>     [{P, process_info(P), BinInfo}
>      || {P, {binary, BinInfo}} <- [{P, process_info(P, binary)}
>                                    || P <- processes()],
>         length(BinInfo) > 100000].
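>
> (A rough follow-on sorts by total binary bytes referenced rather than by
> the number of references; as far as I can tell each BinInfo entry is an
> {Id, Size, RefCount} tuple, so this just sums the sizes:)
>
> BinBytes = fun(P) ->
>                case process_info(P, binary) of
>                    {binary, Bins} -> lists:sum([Sz || {_, Sz, _} <- Bins]);
>                    _ -> 0    % process may have died in the meantime
>                end
>            end.
> lists:reverse(lists:keysort(2, [{P, BinBytes(P)} || P <- processes()])).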
>
> <0.157.0>             gen:init_it/6                    1682835   1873131   0
>                       gen_server2:process_next_msg/8                      13
> <0.158.0>             rabbit_persister:init/1         19590700  29789397   0
> rabbit_persister      gen_server2:process_next_msg/8                      13
>
> I tried your suggestion to free memory and check, but it looks like most
> of it was held up in the persister:
>
> 35> M = [{erlang:garbage_collect(P), memory(total)} || P <-
> erlang:processes()].
>
> 51> [{P, Mem} || {Mem, P} <- lists:zip([Me || {true, Me} <- M], processes()),
>                  Mem < 842757048].
> [{<0.148.0>,842753448},
>  {<0.149.0>,842700248},
>  {<0.150.0>,842700248},
>  {<0.151.0>,842700248},
>  {<0.152.0>,842697224},
>  {<0.154.0>,842697792},
>  {<0.155.0>,842724104},
>  {<0.156.0>,842712824},
>  {<0.157.0>,825951032},
>  {<0.158.0>,602886872},
>  {<0.159.0>,345002144},
>  {<0.177.0>,345002144},
>  {<0.178.0>,345002144},
>  {<0.179.0>,345002144},
>  {<0.180.0>,345002144},
>  {<0.181.0>,345002144},
>  {<0.182.0>,345002144},
>  {<0.183.0>,345002144},
>  {<0.184.0>,345002144},
>  {<0.245.0>,345000624},
>  {<0.247.0>,345001520},
>  {<0.248.0>,344996984},
>  {<0.249.0>,344995464},
>  {<0.250.0>,344995512},
>  {<0.252.0>,344996416},
>  {<0.253.0>,344991880},
>  {<0.254.0>,344991928},
>  {<0.261.0>,...},
>  {...}|...]
>
> So it looks like the large chunks are held up between gen_server2 and
> rabbit_persister.
>
> _steve
>
>
> On Thu, Oct 22, 2009 at 4:24 PM, Matthias Radestock <matthias at lshift.net>wrote:
>
>> Stephen,
>>
>> Stephen Day wrote:
>>
>>> (rabbit at vs-dfw-ctl11)5> [erlang:garbage_collect(P) || P <-
>>> erlang:processes()].
>>> [true,true,true,true,true,true,true,true,true,true,true,
>>>  true,true,true,true,true,true,true,true,true,true,true,true,
>>>  true,true,true,true,true,true|...]
>>>
>>> (rabbit at vs-dfw-ctl11)6> memory().
>>> [{total,145833144},
>>>  {processes,50900752},
>>>  {processes_used,50896864},
>>>  {system,94932392},
>>>  {atom,514765},
>>>  {atom_used,488348},
>>>  {binary,24622512},
>>>  {code,3880064},
>>>  {ets,64745716}]
>>>
>>> This really cut down on usage, so it's likely that the binary GC is
>>> falling behind Rabbit's requirements.
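>>>
>>> (To watch whether it keeps climbing, here is a crude sketch that samples
>>> binary usage once a second for ten seconds:)
>>>
>>> [begin
>>>      io:format("binary: ~p~n", [erlang:memory(binary)]),
>>>      timer:sleep(1000)
>>>  end || _ <- lists:seq(1, 10)].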
>>>
>>
>> Agreed.
>>
>>  How do I track down the uncollected binary heap usage to a process?
>>>
>>
>> Binaries are shared between processes and reference counted, so no single
>> process owns them. There is a process_info item called 'binary' that
>> provides information on the binaries referenced by a process, but I've
>> never looked at that myself, so I don't know how useful the contained
>> info is.
>>
>> One thing you could try is to run the above garbage_collect code
>> interleaved with the memory reporting code, to identify which process
>> results in the biggest drop in memory usage when gc'ed.
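>>
>> Something along these lines, say (untested, just a sketch; the attribution
>> is only approximate, since other allocation happens while it runs):
>>
>> Drop = fun(P) ->
>>            Before = erlang:memory(total),
>>            erlang:garbage_collect(P),
>>            Before - erlang:memory(total)
>>        end.
>> %% ten biggest drops first
>> lists:sublist(lists:reverse(lists:keysort(2,
>>     [{P, Drop(P)} || P <- erlang:processes()])), 10).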
>>
>>
>> Regards,
>>
>> Matthias.
>>
>
>