[rabbitmq-discuss] Rabbitmq falling over & losing messages

Ben Hood 0x6e6562 at gmail.com
Sat Nov 29 00:06:18 GMT 2008


Toby,

On Fri, Nov 28, 2008 at 10:47 AM, Toby White
<toby.o.h.white at googlemail.com> wrote:
> I've pulled the latest repository version, and run the same tests. (This
> also required me to upgrade the version of Erlang I was running on from 11b5
> to 12b3)
> The outcome is basically the same, but I get slightly different errors
> logged.

So are you saying that with the latest version of Rabbit you are still
losing messages that are marked as persistent (as you indicated in
your first post)?

> The publishing script gets an exception like so:
...
> [{exit,{timeout,{gen_server,call,[<0.230.0>,{commit,{{1,<0.247.0>},82082}},5000]}}}]',
> (90, 20), 'Channel.tx_commit')

This client-side trace looks normal for a server crash.

> and the rabbit error message is very slightly different:
>
> =ERROR REPORT==== 28-Nov-2008::10:11:40 ===
> connection <0.242.0> (running), channel 1 - error:
> {amqp,internal_error,
>      "commit failed:
> [{exit,{timeout,{gen_server,call,[<0.230.0>,{commit,{{1,<0.247.0>},82082}},5000]}}}]",
>      'tx.commit'}
>
> (followed by a dump of the whole queue as before.)
>
> Furthermore, the following appears on the console:
>
> (rabbit at domU-12-31-39-02-61-F6)3>
> /usr/lib/erlang/lib/os_mon-2.1.6/priv/bin/memsup: Erlang has closed.
>
>                           Erlang has closed
>
> Crash dump was written to: erl_crash.dump
> eheap_alloc: Cannot allocate 729810240 bytes of memory (of type "heap").
> Aborted

OK, this looks normal for the case where Rabbit runs out of memory
because it has been flooded with messages.

Currently the only preventive measure against this is the
channel.flow command - see this article for the background:
http://hopper.squarespace.com/blog/2008/11/9/flow-control-in-rabbitmq.html

At the moment producer throttling requires a well-behaved client, i.e.
one that obeys the channel.flow command - the Python client currently
isn't well behaved in this respect.
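
Just to illustrate what "well behaved" means here, a rough sketch in
Python of a publisher that gates its sends on the broker's flow
notifications - this is not the current Python client's API; the
channel object, the on_flow hook and the stub are made up for the
example:

import threading


class ThrottledPublisher:
    """Hypothetical publisher that stops sending when the broker turns
    channel.flow off and resumes when it is turned back on."""

    def __init__(self, channel):
        self.channel = channel                 # hypothetical channel object
        self._may_publish = threading.Event()
        self._may_publish.set()                # flow is "on" until told otherwise
        # Hypothetical hook: assume the client library calls this whenever
        # a channel.flow method arrives from the broker.
        channel.on_flow = self._handle_flow

    def _handle_flow(self, active):
        if active:
            self._may_publish.set()            # broker says: carry on
        else:
            self._may_publish.clear()          # broker says: stop publishing

    def publish(self, exchange, routing_key, body):
        # Block here rather than flooding the broker while flow is off.
        self._may_publish.wait()
        self.channel.basic_publish(exchange=exchange,
                                   routing_key=routing_key,
                                   body=body)


class _StubChannel:
    """Stand-in for a real AMQP channel, only so the sketch runs on its own."""
    on_flow = None

    def basic_publish(self, exchange, routing_key, body):
        print("published %r to exchange=%r key=%r" % (body, exchange, routing_key))


if __name__ == "__main__":
    pub = ThrottledPublisher(_StubChannel())
    pub.publish("", "test-queue", "hello")       # goes through immediately
    pub._handle_flow(False)                      # simulate channel.flow(active=False)
    threading.Timer(1.0, pub._handle_flow, [True]).start()
    pub.publish("", "test-queue", "world")       # blocks ~1s until flow resumes

A real client would of course wire the callback into its frame handler
internally rather than exposing it like this.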

> even though, as best I can tell from watching the output of top, the erlang
> process never took more than about 10% of available memory.

Do you not see anything in the log about an alarm handler for memory, e.g.

=INFO REPORT==== 9-Nov-2008::15:13:31 ===
    alarm_handler: {set,{system_memory_high_watermark,[]}}

?

I find the memory statistic a bit strange - the alarm handler kicks in
by default at 95%.

Simon is currently looking into an issue with the way Erlang reports
memory consumption on Linux - maybe he can shed some light on what
may be going on with your installation.

Also, can you give some more details about your environment? Are you
running Xen?

Ben



