[rabbitmq-discuss] Topic routing

Mon Oct 20 21:37:37 BST 2008

Brian,

On Mon, Oct 20, 2008 at 6:00 PM, Brian Sullivan <bsullivan at lindenlab.com> wrote:
> The former.  We want to have a routing key "<domain>.<event>", so that we
> can easily subscribe to the same event across all domains.  You could think
> of it like trying to listen to events that are happening across a number of
> servers.  Same set of event types, but N servers, which we might like to see
> individually, or all together in a subscription.  If we did it with separate
> topic exchanges, we'd need to know the full list of them and subscribe to
> each individually - we'd like to keep it more dynamic than that.

Ok, this sounds like an ideal candidate for topic exchanges from an
application perspective.
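
To make the "<domain>.<event>" idea concrete, here is a minimal sketch of how topic patterns match routing keys. It assumes the standard AMQP topic wildcards ("*" matches exactly one dot-separated word, "#" matches zero or more words). The function name topic_matches and the example keys are made up for illustration; this is not the broker's actual implementation.

```python
def topic_matches(pattern: str, routing_key: str) -> bool:
    """Return True if routing_key matches an AMQP-style topic pattern.

    Illustrative only: '*' stands for exactly one word and '#' for zero
    or more words, where words are separated by dots.
    """
    def match(p, k):
        if not p:
            # Pattern exhausted: it matches only if the key is exhausted too.
            return not k
        head, rest = p[0], p[1:]
        if head == "#":
            # '#' may absorb zero or more words of the key.
            return any(match(rest, k[i:]) for i in range(len(k) + 1))
        if not k:
            return False
        if head == "*" or head == k[0]:
            return match(rest, k[1:])
        return False

    return match(pattern.split("."), routing_key.split("."))


# Hypothetical keys of the form "<domain>.<event>".
print(topic_matches("*.login", "server1.login"))    # same event, any domain
print(topic_matches("server1.*", "server1.logout")) # all events, one domain
print(topic_matches("*.login", "server1.logout"))   # different event
```

A binding of "*.login" therefore picks up the login event from every domain without the subscriber needing to know the list of domains in advance, whereas "server1.*" sees one domain in isolation.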

> Is there a way to speed up this draining process, in case there is a
> particularly high volume feed that has gotten out of control and we don't
> want it to impact more critical feeds?

The way to deal with high volume ingress is the Channel.Flow command.
Please refer to this thread:

http://lists.rabbitmq.com/pipermail/rabbitmq-discuss/2008-October/002132.html

I've just checked the timestamp of that thread - it is 3 minutes
earlier than this post, hence you probably did not have a chance to
read it before you wrote this.

> My plan has been to route the
> heaviest feeds to particular nodes on the cluster, so that they can manage
> that busy queue there.  Our publishers are sending all event types, so the
> heavier topics are not yet separated out (hence the appeal of a message
> queue routing system).

Clustering will not affect the run time complexity of routing, since
the routing table has to be coherent across the cluster.

In a clustered scenario, the queues may reside on different nodes,
whereas exchanges will *exist* on every node.

> Also, from what I can see in testing, the overhead of routing messages to
> nodes in the cluster that have active subscriptions is pretty low.  I know I
> can throw more consuming applications to pull off a single queue on that
> node to keep up with the flow.  However, there's got to be a saturation
> point where that node cannot manage the volume of all messages with that
> topic key going to one place.

I think that you might be touching on the subject that would make AMQP
truly globally scalable - eventual consistency.

With eventual consistency, you *could* relax the constraint of having
to have a single globally consistent queue endpoint and allow
multiple queues with the same name. If you were to do this, you'd have
to build in the notions of merging and other conflicts that may arise
into the AMQP model.

Until this happens, you are stuck with the global consistency of AMQP.

From a practical perspective, you have to ask yourself when you are
likely to run into that saturation point.

If you think that may happen at some stage soon, you may well have to
reconsider using RabbitMQ.

It might be worth just calibrating this, i.e. measuring where that
point actually lies for your workload.

> Is there any way to scale this without
> changing my routing keys?

Ok, so the general advice we give is to try to exploit some kind of
partition that is natural to your application - not a very original
idea, I know, and it doesn't *actually* answer your question :-(

> The only way I can think of managing that limit
> at this point is to start generating smarter keys so that I can split the
> load by key to different nodes (for example "mytopickey.1", "mytopickey.2").
>  Is there something cleaner (non-data-related) than that?

No and yes.

No in the sense that there is nothing OOTB that I know of that will
solve this issue *without* you getting funky with your routing keys.

Yes in the sense that you could implement a custom exchange type that
allows you to continue with your current naming scheme and the
exchange is smart enough to perform some kind of partitioned routing
that would facilitate very high volume routing.
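
As a rough illustration of the "funky routing keys" option, the sketch below derives a partition suffix from a stable hash, so that publishers spread one logical topic over N keys such as "mytopickey.1" and "mytopickey.2". The function name partitioned_key, the choice of CRC32, and the idea of hashing a per-message identifier are all assumptions made for this example; nothing here is a RabbitMQ API. A custom exchange type could apply the same kind of hash internally, which would let publishers keep their existing keys.

```python
import zlib


def partitioned_key(base_key: str, shard_source: str, partitions: int) -> str:
    """Append a 1-based partition number to base_key.

    shard_source is any stable per-message string (hypothetical, e.g. a
    server name); the same value always maps to the same partition.
    """
    if partitions < 1:
        raise ValueError("partitions must be at least 1")
    # CRC32 is stable across processes, unlike Python's built-in hash().
    shard = zlib.crc32(shard_source.encode("utf-8")) % partitions
    return f"{base_key}.{shard + 1}"


# Example: spread one heavy topic over two keys.
for source in ("server-a", "server-b", "server-c"):
    print(source, "->", partitioned_key("mytopickey", source, 2))
```

Each partitioned key can then be bound to a queue on a different node, while a consumer that still wants the whole feed binds with a pattern such as "mytopickey.*".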

> The only "issue" I have been able having is theoretical, since I expected
> the SpammyTopicProducer sample code to sustain some max throughput without
> any subscribers.  I have made a quick modification to publish on a constant
> topic (to make sure it wasn't due to a routing table issue with millions of
> topics) and I still saw the same behavior (where it seems the broker doesn't
> throw away the messages fast enough).

Yes, the solution for this is the Channel.Flow patch (as the Linux
team say - crashing a kernel near you sometime soon :-) ).

Ben