[rabbitmq-discuss] millions of unack'd messages in a day -- disk store instead of ram?

Thu Apr 30 16:29:30 BST 2009

On Thu, Apr 30, 2009 at 11:11:08AM -0400, Brian Whitman wrote:
> I was able to hammer rabbit with far more messages than I've ever been able
> to store. Once it crossed into swap amazingly write speed only suffered
> slightly (probably because the old messages were swapped out, not the new
> ones), but get speed did suffer. The important part is that the server
> stayed up, even after using 20GB of virtual RAM (millions of large messages)

Yes, this is broadly in line with my blog post at
http://www.lshift.net/blog/2009/04/02/cranial-surgery-giving-rabbit-more-memory

But the behaviour you're seeing is not typical of a more general-purpose
use case. When you're just hammering in messages, the OS can make
reasonably sensible decisions about which pages to evict, and in general
all you're asking of the OS is "get me a new empty page which I can fill".

In the get case, however, life is much harder. It's really quite likely
that the next page you're going to want back in is also the next page the
OS is going to choose to evict, so in the case where you're doing
1-in-1-out you could well find performance suffers catastrophically. Even
if you're just doing plain gets, until the GC runs (which on its own could
cause a lot of swap thrashing) and frees up pages, you're now asking a
more complex question of the OS - namely "choose a page to evict and in
its place get me this page which contains the messages I need to deliver".
This will typically require both a write and a read.
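
To make the difference between the two cases concrete, here is a rough
toy model, not a description of what the OS or the Erlang VM actually
does. It assumes strict LRU eviction (real kernels only approximate
this), a fixed number of messages per page, and that consumed pages are
never freed (i.e. the GC has not yet run). The function name simulate
and its parameters are made up for illustration.

```python
from collections import OrderedDict


def simulate(ops, ram_pages, msgs_per_page):
    """Count paging events for a FIFO queue under a strict-LRU toy model.

    ops is a sequence of "publish" / "get" strings. Assumptions: messages
    fill pages in order, and pages are never freed after being consumed.
    """
    resident = OrderedDict()  # pages currently in RAM, least recent first
    allocated = set()         # every page that has ever been handed out
    published = consumed = 0
    stats = {"fresh": 0, "swap_in": 0, "evicted": 0}

    def touch(page):
        if page in resident:
            resident.move_to_end(page)    # hit: mark as most recently used
            return
        if len(resident) >= ram_pages:
            resident.popitem(last=False)  # evict the least recently used
            stats["evicted"] += 1         # costs a write to swap
        if page in allocated:
            stats["swap_in"] += 1         # old page: costs a read from swap
        else:
            allocated.add(page)
            stats["fresh"] += 1           # new empty page: no read needed
        resident[page] = None

    for op in ops:
        if op == "publish":
            touch(published // msgs_per_page)
            published += 1
        elif op == "get":
            if consumed >= published:
                raise ValueError("get on an empty queue")
            touch(consumed // msgs_per_page)
            consumed += 1
        else:
            raise ValueError("unknown op: %r" % (op,))
    return stats


if __name__ == "__main__":
    fill = ["publish"] * 1000
    print("publish only:", simulate(fill, 10, 10))
    print("then drain:  ", simulate(fill + ["get"] * 1000, 10, 10))
```

In this model the publish-only run never swaps anything back in: every
miss is a fresh page plus, once RAM is full, a write. Draining the same
backlog needs every page read back from swap, because the oldest
messages sit on exactly the pages that were evicted first, and each of
those reads also forces an eviction.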

Also, as I've recently commented on that blog post, we don't really have
enough details about how the Erlang VM is choosing to organise data and so
it may not be as efficient as we would like.

However, it is obviously a "good thing" that you have managed to get this
to work. Beware though that the broker you are using could have internal
messages time out on it - this was fixed in bug 20546, which is in our
"default" development branch, not the stable "v1_5" branch. As such, you
may see messages coming out of the broker which indicate that strange
timeouts occurred. Thus, if you want to use RabbitMQ in this way, we
would recommend you use the head of the "default" branch.

Also, you need the memory alarms turned off, otherwise flow control will
kick in when you start to run out of memory.

Matthew