[rabbitmq-discuss] Rabbitmq falling over & losing messages

Ben Hood 0x6e6562 at gmail.com
Mon Dec 1 17:00:35 GMT 2008


Toby,

On Mon, Dec 1, 2008 at 11:33 AM, Toby White
<toby.o.h.white at googlemail.com> wrote:
>> Do you not see anything in the log about an alarm handler for memory, e.g.
>>
>> =INFO REPORT==== 9-Nov-2008::15:13:31 ===
>>   alarm_handler: {set,{system_memory_high_watermark,[]}}

I've tried to simulate your test locally and got the same results,
albeit with different figures. The issue is that even with persistent
messaging you are still bound by memory, because Rabbit writes a copy
of each message to the journal while keeping the original in memory.
You can see this in the fact that Erlang tries to allocate ~700MB of
memory in one hit:

> Crash dump was written to: erl_crash.dump
> eheap_alloc: Cannot allocate 729810240 bytes of memory (of type "heap").
> Aborted

Because the allocation is so large, the memory supervisor never gets a
chance to set the high watermark and let producer flow control
throttle the rate of production - the interpreter has already died.

This is caused by the fact that nothing is consuming messages, so the
persister log grows continuously. Because of the way the snapshot
mechanism currently works, creating a snapshot of a growing journal
requires an increasing amount of memory.
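
To illustrate the shape of the problem - this is not the actual
persister code, just a sketch of the snapshot-plus-journal pattern it
follows:

    import pickle

    journal = []  # append-only log of outstanding persistent messages

    def persist(msg):
        # Each persistent publish appends to the journal.
        journal.append(msg)

    def snapshot():
        # Compacting the log means serialising *all* outstanding
        # messages in one go, so the memory needed for a snapshot
        # grows with the journal. With no consumer draining the queue,
        # this allocation eventually exceeds what the VM can obtain in
        # a single request.
        return pickle.dumps(journal)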

A solution to this may be to refactor the persister so that it can
handle large surges in persistent message volume without relying on
something to drain it.

However, we are looking into the whole area of disk-based queue paging
in general, and it may be more appropriate to address the symptom you
are seeing as part of that work.

Another reason not to just *optimize* the persister is holistic - even
if you can write messages to disk smoothly, you still run into the
issue that every message is also kept in memory.

HTH,

Ben



