[rabbitmq-discuss] Rabbitmq falling over & losing messages
Toby White
toby.o.h.white at googlemail.com
Mon Dec 1 11:33:05 GMT 2008
On 29 Nov 2008, at 00:06, Ben Hood wrote:
> So are you saying that with the latest version of Rabbit you are still
> losing messages that are marked as persistent (as you indicated in
> your first post)?
Yes.
> Ok, this looks normal for a case when Rabbit runs out of memory
> because you have flooded it with messages.
>
> Currently the only preventative action against this is the
> channel.flow command - see this article for the background :
> http://hopper.squarespace.com/blog/2008/11/9/flow-control-in-rabbitmq.html
>
> ATM producer throttling requires a well behaved client, i.e. one that
> obeys the channel.flow command - the Python client currently isn't
> well behaved in this respect.
Thanks - I'd seen that blog post, but was hoping I wouldn't be running
into flooding issues quite yet!
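For what it's worth, my understanding of what a "well behaved" producer
would look like is roughly the sketch below. The on_flow hook is purely
hypothetical (the current Python client doesn't expose anything like it),
but it shows the pattern of pausing publishes while the broker has flow
switched off:

import threading

# Hypothetical sketch of a "well behaved" producer.  The broker sends
# channel.flow(active=False) when it wants publishers to pause and
# channel.flow(active=True) when they may resume.  The on_flow hook
# below is an assumption -- the current Python client does not offer
# such a callback -- but the pausing pattern is the point.

publish_allowed = threading.Event()
publish_allowed.set()        # flow starts out active

def on_flow(active):
    # Would be called by the client library whenever a channel.flow
    # method arrives from the broker (hypothetical hook).
    if active:
        publish_allowed.set()
    else:
        publish_allowed.clear()

def publish_all(channel, messages):
    for msg in messages:
        publish_allowed.wait()   # block rather than publish while throttled
        channel.basic_publish(msg, exchange="test-exchange",
                              routing_key="test-key")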
>> even though, as best I can tell from watching the output of top,
>> the erlang
>> process never took more than about 10% of available memory.
>
> Do you not see anything in the log about an alarm handler for
> memory, e.g.
>
> =INFO REPORT==== 9-Nov-2008::15:13:31 ===
> alarm_handler: {set,{system_memory_high_watermark,[]}}
No alarm handlers of any sort, nor anything obviously to do with memory.
root@domU-12-31-39-02-61-F6:/tmp# grep -i alarm rabbit.log
root@domU-12-31-39-02-61-F6:/tmp# grep -i memory rabbit.log
root@domU-12-31-39-02-61-F6:/tmp#
From the start of message sending to the crash, the server log looks
like:
=INFO REPORT==== 1-Dec-2008::11:25:46 ===
accepted TCP connection on 0.0.0.0:5672 from 127.0.0.1:45049
=INFO REPORT==== 1-Dec-2008::11:25:46 ===
starting TCP connection <0.216.0> from 127.0.0.1:45049
=INFO REPORT==== 1-Dec-2008::11:25:53 ===
Rolling persister log to "/tmp/rabbitmq-rabbit-mnesia/
rabbit_persister.LOG.previous"
=INFO REPORT==== 1-Dec-2008::11:25:59 ===
Rolling persister log to "/tmp/rabbitmq-rabbit-mnesia/
rabbit_persister.LOG.previous"
=INFO REPORT==== 1-Dec-2008::11:26:06 ===
Rolling persister log to "/tmp/rabbitmq-rabbit-mnesia/
rabbit_persister.LOG.previous"
=INFO REPORT==== 1-Dec-2008::11:26:13 ===
Rolling persister log to "/tmp/rabbitmq-rabbit-mnesia/
rabbit_persister.LOG.previous"
=INFO REPORT==== 1-Dec-2008::11:26:21 ===
Rolling persister log to "/tmp/rabbitmq-rabbit-mnesia/
rabbit_persister.LOG.previous"
=INFO REPORT==== 1-Dec-2008::11:26:30 ===
Rolling persister log to "/tmp/rabbitmq-rabbit-mnesia/
rabbit_persister.LOG.previous"
=INFO REPORT==== 1-Dec-2008::11:26:38 ===
Rolling persister log to "/tmp/rabbitmq-rabbit-mnesia/
rabbit_persister.LOG.previous"
=INFO REPORT==== 1-Dec-2008::11:26:58 ===
Rolling persister log to "/tmp/rabbitmq-rabbit-mnesia/
rabbit_persister.LOG.previous"
=INFO REPORT==== 1-Dec-2008::11:27:18 ===
Rolling persister log to "/tmp/rabbitmq-rabbit-mnesia/
rabbit_persister.LOG.previous"
=ERROR REPORT==== 1-Dec-2008::11:27:23 ===
connection <0.216.0> (running), channel 1 - error:
{amqp,internal_error,
"commit failed: [{exit,{timeout,{gen_server,call,[<0.212.0>,
{commit,{{1,<0.221.0>},93093}},5000]}}}]",
'tx.commit'}
=INFO REPORT==== 1-Dec-2008::11:27:23 ===
closing TCP connection <0.216.0> from 127.0.0.1:45049
[followed by a dump of the whole queue]
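For reference, the producer is doing essentially a transactional publish
of persistent messages; a rough sketch of that pattern (using py-amqplib,
with placeholder exchange/queue names rather than my actual code) is:

from amqplib import client_0_8 as amqp

# Rough sketch of a transactional, persistent publish (placeholder
# exchange/queue names).  The tx_commit() at the end corresponds to
# the 'tx.commit' that times out in the error report above.
conn = amqp.Connection(host='localhost:5672', userid='guest', password='guest')
ch = conn.channel()
ch.queue_declare(queue='test-queue', durable=True, auto_delete=False)
ch.exchange_declare('test-exchange', 'direct', durable=True, auto_delete=False)
ch.queue_bind(queue='test-queue', exchange='test-exchange', routing_key='test-key')

ch.tx_select()                    # open a transaction on the channel
for i in range(100000):
    msg = amqp.Message('payload %d' % i, delivery_mode=2)  # 2 = persistent
    ch.basic_publish(msg, exchange='test-exchange', routing_key='test-key')
ch.tx_commit()                    # the call that times out when the persister falls behind

ch.close()
conn.close()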
> I find the memory statistic a bit strange - the alarm handler kicks in
> by default at 95%.
>
> Simon is currently looking into an issue with the way Erlang reports on
> memory consumption on Linux - maybe he can shed some light on what
> may be going on with your installation.
>
> Also, can you give some more details about your environment? Are you
> running Xen?
Yes; this is on an Amazon EC2 instance. Currently I'm using just a
small instance (1.7 GB of memory, 160 GB of instance storage, 32-bit
platform) - eventually I'll be running on a larger instance, but I'm
still working my way up to that; I was trying to calibrate my resource
usage when I ran into this issue. It's running mostly Ubuntu Hardy,
but now with Erlang R12B-3 from Intrepid. Nothing else is running on
the instance except essential services (sshd, cron, etc.).
The crash occurs consistently at about 10% memory usage. Memory usage
actually increases shortly after the crash, up to about 30-40% or so;
I'm guessing this is Erlang formatting the queue object for output to
the log.
Toby