[rabbitmq-discuss] RabbitMQ load issues

Tue Jan 17 20:57:46 GMT 2012

Hi Simon,

Thanks for your reply.  I'll try to clarify/respond in kind below:

<SNIP>

> Hmm. While RabbitMQ is designed to scale :-) 100k queues and 700k bindings
> does start to sound like "a lot".
>
> Having said that I was able to create that number of queues / bindings on my
> desktop machine in half an hour or so without *too* much trouble.
>
> Some tips:
>
> * If you're declaring queues in bulk, make sure you do so synchronously (i.e.
> nowait=false).

To be perfectly honest, I can't find this (after a relatively cursory
investigation) in the APIs I'm using (the Akka 1.2 AMQP module, which
leverages RabbitMQ's Java lib, IIRC).  Having said that, I'm creating
queues one at a time (albeit quickly), so I'm not sure whether this
counts as the bulk queue declaration you're referring to.  I'll dig
further into the APIs on my side.

> * If you're going to have that many queues and that few routing keys then
> presumably every message published will be delivered to huge numbers of
> queues. I assume you expect your publishing rate to be small. Publishing a
> couple of messages per second is a lot if they're delivered to 50k queues.

The number of routing keys is actually fairly high, given that the
userId and the actual message categories are completely variable.  In
the example I gave (~100k queues), each queue is bound to a unique
routing key (topic), and a message is typically delivered to a very
small number of queues (20-30 would be a theoretical maximum and a very
rare situation).  As additional applications are built on top of the
base collection layer and RabbitMQ, the queue count will obviously go
much higher.

For what it's worth, our testing load (30k "users" with 6 queues per
user in my original posting, though we never got that high before rabbit
choked) generates an easy 150-200 messages per second going into
RabbitMQ, and this can spike higher.  I expect production to see at
least 200k-300k users.
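
To put rough numbers on the figures above, here is a back-of-the-envelope
sketch.  It assumes the 6-queues-per-user layout still holds in production
(it may not, given the wildcard change described further down) and uses the
20-30 queue fanout as a worst case; the function names are made up for
illustration.

```python
# Back-of-the-envelope sizing from the figures quoted in this thread.
# Assumption: 6 queues per user, as in the original test layout.
QUEUES_PER_USER = 6


def queue_count(users, queues_per_user=QUEUES_PER_USER):
    """Total queues needed for a given number of users."""
    return users * queues_per_user


def deliveries_per_second(publish_rate, fanout):
    """Queue deliveries per second if every message reaches `fanout` queues."""
    return publish_rate * fanout


test_queues = queue_count(30_000)                    # 180,000 queues
prod_queues_low = queue_count(200_000)               # 1,200,000 queues
prod_queues_high = queue_count(300_000)              # 1,800,000 queues
worst_case_rate = deliveries_per_second(200, 30)     # 6,000 deliveries/s

print(test_queues, prod_queues_low, prod_queues_high, worst_case_rate)
```

So the planned test load alone implies 180k queues, and production at the
same layout would be in the 1.2M-1.8M range, while the delivery rate stays
modest (a few thousand per second even in the rare worst case).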

> * Every queue process consumes a fixed small amount of memory. Spreading
> queues out over a cluster will help.

We're experimenting with a 3-node cluster as of today, though I've
reduced the queue count by determining the update type at my
application layer and using topics with wildcards, e.g.
"<source>.<userId>.*"

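For reference, a minimal sketch of how a wildcard binding like the one above
behaves under AMQP topic rules ("*" matches exactly one word, "#" matches
zero or more).  This is an illustrative matcher, not RabbitMQ's actual
implementation, and the routing keys ("twitter", "42", "mention") are
invented examples.

```python
def topic_matches(pattern, routing_key):
    """Return True if routing_key matches an AMQP-style topic pattern.

    Words are separated by dots; '*' matches exactly one word and
    '#' matches zero or more words.
    """
    p = pattern.split(".")
    k = routing_key.split(".")

    def match(i, j):
        if i == len(p):
            # Pattern exhausted: match only if the key is exhausted too.
            return j == len(k)
        if p[i] == "#":
            # '#' may absorb zero or more of the remaining words.
            return any(match(i + 1, n) for n in range(j, len(k) + 1))
        if j == len(k):
            return False
        if p[i] == "*" or p[i] == k[j]:
            return match(i + 1, j + 1)
        return False

    return match(0, 0)


# One queue bound with "twitter.42.*" receives every update type for that
# user, instead of needing one queue per update type.
print(topic_matches("twitter.42.*", "twitter.42.mention"))
print(topic_matches("twitter.42.*", "twitter.43.mention"))
```

Note that "*" does not match a missing or multi-word suffix, so
"twitter.42.*" would not match "twitter.42" or "twitter.42.a.b"; "#" would
be needed for that.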
> * Every queue and binding creates a row in an Mnesia table. Mnesia tables
> always reside in RAM, so for huge numbers of queues in a cluster expect to
> use lots of RAM. Therefore instead of clustering you will probably find it
> better to shard the queues out over independent brokers and connect them
> with federation. The case of huge numbers of queues and very few routing
> keys could almost be made for federation. For the massive fanout case you
> probably want to build a (maybe multi-layered) federation tree rather than
> the complete graph people use more often with federation.

Thanks for suggesting this!  We may go this route, but I'm wary of
putting sharding logic in our application layer (perhaps unjustifiably
so).  Truth be told, I was kind of hoping for a "silver bullet" with
RabbitMQ ;)  In the meantime, we'll bump the RAM for our cluster nodes
and see what happens.
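
For a sense of how small the application-side sharding logic could be, here
is a minimal sketch that maps a userId to one of several independent
brokers with a stable hash.  The broker names and the broker_for helper are
hypothetical, and this is only one possible scheme, not something RabbitMQ
or federation prescribes.

```python
import hashlib

# Hypothetical broker names; in practice these would be connection settings.
BROKERS = ["broker-a", "broker-b", "broker-c"]


def broker_for(user_id, brokers=BROKERS):
    """Pick a broker for a user via a stable hash of the userId.

    hashlib is used instead of the built-in hash(), whose value for
    strings changes between Python processes.
    """
    digest = hashlib.sha1(str(user_id).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(brokers)
    return brokers[index]


# The same user always lands on the same broker.
print(broker_for(12345) == broker_for(12345))
```

The obvious drawback of plain modulo hashing is that changing the number of
brokers remaps most users; consistent hashing would reduce that at the cost
of a bit more code.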

> * For every active queue (one that has sent or received a message in the
> last few seconds) there is a small CPU cost when using the management
> plugin. With huge numbers of queues per node it may be helpful to disable
> mgmt. On the other hand "rabbitmqctl list_queues" will contact all the
> queues synchronously (maybe waking them up). Monitoring will be an issue.

Good to know, we'll keep that in mind while testing load, etc.

Thanks for the feedback!  Any other ideas based on my above responses
are *most* welcome!

Regards,

Jeremy Pierre