[rabbitmq-discuss] High-performance routing strategies

Wed Mar 23 13:11:55 GMT 2011

Hi,

On Mon, Mar 21, 2011 at 10:47:53AM -0600, Helena Edelson wrote:
> > 100,000 queues on a single broker is fine - I've just created that on my
> > desktop with no problems. You could likely get to well over a million
> > with a bit of tuning and enough RAM.
> 
> Can you direct me toward acknowledged, tested RAM requirements? I've seen
> some information pass by in the list and other docs but would like to see
> something concrete.

Erm, we don't really have anything concrete... there are a couple of
issues. One is that Rabbit dynamically pushes messages to disk to free
up RAM, so it's not correct to think that the memory footprint of a
queue depends simply on its contents. The other is that after a queue
has been idle for a while (10 seconds or more), it'll "hibernate",
which causes a GC of that queue and can dramatically shrink the amount
of memory the queue requires. So, for example, it might be possible to
have, say, 1000 empty queues in 10MB of RAM, but only when they're
idle. When they're all active (even if empty), they might, depending on
memory fragmentation and suchlike, consume much more memory.

So, some quick testing:

                                             VSZ     RSS (both in kB)
on startup, no queues, no connections:    203092   89492
no queues, 2000 network connections:      265820  146996
 (so we're talking about 32kB per connection)
no queues, close 2000 connections:        263996  145632
 (typical for a VM: it doesn't hand back RAM to the OS immediately)

-> restart rabbit
no queues, 1 connection, 2000 channels:   266052  147000
 (similarly, around 32kB per channel)

-> restart rabbit
1 connection, 1 channel, 2000 queues:     308764  187400
 (a bit more - around 50kB per queue)
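
For anyone wanting to redo the arithmetic on their own numbers, here is
a minimal sketch. It assumes that each restart brings the broker back
to roughly the startup figures, which the test did not verify
separately, and the names (BASELINE, SAMPLES, per_object_kb) are purely
illustrative.

```python
# Figures copied from the measurements above, in kB.
# Assumption: after each restart the broker returns to this baseline.
BASELINE = {"vsz": 203092, "rss": 89492}
SAMPLES = {
    "connection": {"vsz": 265820, "rss": 146996},
    "channel": {"vsz": 266052, "rss": 147000},
    "queue": {"vsz": 308764, "rss": 187400},
}
COUNT = 2000  # objects created in each test


def per_object_kb(sample, baseline=BASELINE, count=COUNT):
    """Return the average (VSZ, RSS) growth in kB per object created."""
    return tuple((sample[key] - baseline[key]) / count for key in ("vsz", "rss"))


for name, sample in SAMPLES.items():
    vsz, rss = per_object_kb(sample)
    print(f"{name:<10} VSZ {vsz:5.1f} kB   RSS {rss:5.1f} kB")
```

That comes out at roughly 29-31kB per connection or channel and 49-53kB
per queue, depending on whether you go by RSS or VSZ, which is in line
with the rough figures quoted above.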

Now that's with all processes "active". The effect of them hibernating
is difficult to observe at this stage because the Erlang VM does not
hand memory back to the OS straight away. However, it can be observed if you have a large process
that hibernates and has very fragmented memory (the amount reclaimed can
be sufficient to force the VM to hand back memory to the OS). Or if you
have a test that just steadily increases the memory footprint of Rabbit
then the effect of hibernation of idle processes would be to reduce the
rate of increase of memory use.

> Say I have an application consisting of a server to 20,000 agents. Each
> agent-server request/response conversation can be broken down into 15 basic
> classification types with say 5 sub-conversation types. If I leverage topic
> exchanges for their behavioral flexibility, primarily use hierarchical
> routing keys and say, create a topic per major classification/conversation
> and use more specific routing keys for sub-conversations creating more
> lower-cost Bindings, are you saying that is not more performant than going
> with an exponential amount of direct exchanges and more queues? I can
> partition and load balance the queues that would handle the very large
> message payloads easily enough, but for general high-traffic small JSON
> payloads/quick turn around for instance, what is the recommendation?

The recommendation is to do both and see which one works better for
you! ;) Generally, if you have a finite domain of routing keys, then
many fanout exchanges might be the right thing (literally, one exchange
per routing key). If you have a potentially infinite number of routing
keys, then that won't work, so you'll have to go with topic exchanges.
Quite where direct exchanges fit in here is difficult to pinpoint,
except that they're a faster form of topic exchange provided you don't
need wildcards.
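
To make the difference between the three exchange types concrete, here
is a toy model of the routing decision. It is only a sketch of the
semantics ('*' matches exactly one word, '#' matches zero or more), not
how Rabbit implements routing internally, and all the function names
and routing keys are made up for illustration.

```python
def topic_matches(pattern, routing_key):
    """AMQP-style topic match: '*' is exactly one word, '#' is zero or more."""
    def match(p, k):
        if not p:
            return not k
        head, rest = p[0], p[1:]
        if head == "#":
            # '#' may consume anything from zero words to all remaining words.
            return any(match(rest, k[i:]) for i in range(len(k) + 1))
        if not k:
            return False
        return (head == "*" or head == k[0]) and match(rest, k[1:])

    return match(pattern.split("."), routing_key.split("."))


def route_fanout(queues, routing_key):
    """Fanout: the routing key is ignored, every bound queue gets the message."""
    return set(queues)


def route_direct(bindings, routing_key):
    """Direct: exact-match lookup (bindings maps routing key -> queue names)."""
    return set(bindings.get(routing_key, ()))


def route_topic(bindings, routing_key):
    """Topic: naive scan of (pattern, queue) pairs, matching each pattern."""
    return {queue for pattern, queue in bindings if topic_matches(pattern, routing_key)}


# Hypothetical bindings, purely for illustration.
topic_bindings = [("status.*.cpu", "q.cpu"), ("status.#", "q.all_status")]
print(route_topic(topic_bindings, "status.agent42.cpu"))
print(route_direct({"status.cpu": ["q.cpu"]}, "status.cpu"))
print(route_fanout(["q.cpu", "q.all_status"], "anything"))
```

In this model the fanout and direct cases do no pattern matching at
all, which is the intuition behind preferring them when the set of
routing keys is finite.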

However, as I've said before, it's likely to be binding churn that can
kill you, so if you're creating lots of new queues and bindings to those
queues whenever consumers attach then you might run into problems. Thus
if you have exchanges 'X1' ... 'Xn' to which lots of msgs are being
published, and whenever consumer 'Cm' connects it has to create
bindings from its own queue to all of 'X1' ... 'Xn', then you might have
problems, depending on the rate of connections. One neat solution is for
each consumer to have its own secondary exchange 'Ym'. Then use
exchange-to-exchange bindings to bind all of 'X1' ... 'Xn' to 'Ym'.
These bindings then always exist (make sure 'Ym' is not auto-delete).
Then, whenever consumer 'Cm' connects, all it has to do is create its
queue and bind that queue to 'Ym'. Make 'Ym' a fanout exchange, and you
should find this is very fast, and reduces the binding churn rate to 1
per connection, rather than potentially n per connection.
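
A rough sketch of that pattern follows. It assumes a channel object
whose methods and keyword names look like those of the Python pika
client's channel (exchange_declare, exchange_bind, queue_declare,
queue_bind); check them against whichever client you actually use. The
helper names, the queue name and the RecordingChannel stand-in are all
hypothetical; the stand-in is only there so the example runs without a
broker.

```python
def provision_consumer_exchange(channel, consumer_exchange, source_exchanges):
    """One-off setup: create fanout exchange 'Ym' and bind every 'Xi' to it."""
    channel.exchange_declare(
        exchange=consumer_exchange,
        exchange_type="fanout",
        auto_delete=False,  # 'Ym' must outlive the consumer's connections
    )
    for source in source_exchanges:
        # Exchange-to-exchange binding: messages flow from source to 'Ym'.
        channel.exchange_bind(destination=consumer_exchange, source=source)


def attach_consumer(channel, consumer_exchange, queue_name):
    """Per-connection work: declare one queue and create exactly one binding."""
    channel.queue_declare(queue=queue_name, exclusive=True)
    channel.queue_bind(queue=queue_name, exchange=consumer_exchange)


class RecordingChannel:
    """Stand-in for a real channel: records calls instead of talking to a broker."""

    def __init__(self):
        self.calls = []

    def __getattr__(self, name):
        def record(**kwargs):
            self.calls.append((name, kwargs))
        return record


channel = RecordingChannel()
sources = [f"X{i}" for i in range(1, 6)]          # 'X1' ... 'X5'
provision_consumer_exchange(channel, "Y1", sources)
setup_calls = len(channel.calls)                   # one declare plus one bind per 'Xi'
attach_consumer(channel, "Y1", "C1.queue")
per_connection_binds = sum(
    1 for name, _ in channel.calls[setup_calls:] if name == "queue_bind"
)
print(setup_calls, per_connection_binds)
```

The point of splitting it this way is that provision_consumer_exchange
runs once per consumer identity, while attach_consumer is the only part
that runs on every (re)connection.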

> With regard to connections and channels, I wonder if someone can point me to
> data or research on thread behavior, leveraging threads going to
> sleep/waking up vs creating/closing connections for each client request to
> the broker.

Erlang is a multithreaded VM, and happily takes advantage of multiple
cores. It presents green threads to the programmer (which are called
'processes' because, unlike threads, they conceptually do not share an
address space - the fact that in practice they do is merely an
implementation detail). I'm not quite sure what research you're asking
for, but nothing I've read recently springs to mind.

> If I am delegating to a Queue to handle runtime requests to create queues,
> exchanges, and bindings, might that take care of resultant capacity issues?
> I have typically federated JMS brokers and wonder about a similar
> topology/configuration.

I'm afraid I don't really follow: in AMQP it's the channel that provides
functionality such as creating queues, exchanges, bindings et al. Queues
are merely FIFO structures that buffer messages. I suspect my confusion
here is most likely one of terminology.

Matthew