[rabbitmq-discuss] Topic routing

Wed Oct 22 23:15:48 BST 2008

Brian,

On Wed, Oct 22, 2008 at 7:52 PM, Brian Sullivan <bsullivan at lindenlab.com> wrote:
> I mean 3) number of actual topic keys that I publish on - if I need to add
> some variables to the keys so I can distribute load across multiple queues
> ("mytopic.1", "mytopic.2").  Some ideas I had here were a little funky and
> might cause a multiplying of actual topic key strings that I use.  One idea
> was to use the publisher's hostname in the topic key (ex: "mytopic.host123")
> so that I could create a binding to "mytopic.host*3" to get ~1/10th the load
> on each queue (however now that I think about it, I don't know if I can do
> that with the wildcards anyways - I think * matches only full words in the
> key, doesn't it?).

Well, the O(n) I refer to is not option (3). The cost of routing a
message on a topic exchange is linear in the number of bindings,
irrespective of how many different routing keys you use.
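
To illustrate the point, here is a minimal sketch of a naive router that
tests every binding in turn. It is not RabbitMQ's actual implementation;
the names matches, route and the queue names are hypothetical, and the
matcher is simplified to handle only "*". The number of patterns tested
per message equals the number of bindings, whichever key is published.

```python
def matches(pattern, key):
    """Simplified matcher: '*' stands for exactly one word; '#' is not handled."""
    p_words = pattern.split(".")
    k_words = key.split(".")
    if len(p_words) != len(k_words):
        return False
    return all(p == "*" or p == k for p, k in zip(p_words, k_words))


def route(bindings, routing_key):
    """Return (matching queues, number of patterns tested) for one message."""
    tested = 0
    queues = []
    for pattern, queue in bindings:
        tested += 1  # every binding is examined once per message
        if matches(pattern, routing_key):
            queues.append(queue)
    return queues, tested


# Hypothetical bindings: three in total, so three tests per message.
bindings = [("mytopic.*", "q_all"), ("mytopic.1", "q1"), ("mytopic.2", "q2")]

for key in ("mytopic.1", "mytopic.2", "mytopic.host123"):
    queues, tested = route(bindings, key)
    print(key, queues, tested)
```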

Whether foo.*.[0-9] delivers true load balancing will depend, on the
one hand, on the relative frequency of each individual key.

On the other hand, it will also depend on how homogeneous the capacity
of consumption is on each queue - the more heterogeneous this is, the
less load balancing you will experience.

Though you could also imagine a situation whereby skew in the key
distribution is counteracted by heterogeneous queue throughput, thus
bringing balance into the system by accident.

And you can imagine the opposite, where a skewed distribution is
amplified by unbalanced queues.
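
A rough sketch of both cases, assuming foo.*.[0-9] is shorthand for one
binding (and one queue) per final word. The key counts and consumer
rates are made-up numbers, and load_per_queue and drain_time are
hypothetical helpers, so treat this as an illustration rather than a
measurement.

```python
from collections import Counter


def load_per_queue(key_counts):
    """Sum message counts per queue, where the key's last word picks the queue."""
    load = Counter()
    for key, count in key_counts.items():
        load[key.rsplit(".", 1)[-1]] += count
    return dict(load)


def drain_time(load, rates):
    """Seconds each queue needs to work off its share at the given rate (msgs/s)."""
    return {queue: load[queue] / rates[queue] for queue in load}


# Made-up traffic: keys ending in "0" are twice as frequent as keys ending in "1".
key_counts = {"foo.a.0": 100, "foo.b.0": 300, "foo.a.1": 100, "foo.b.1": 100}
load = load_per_queue(key_counts)

# Faster consumers on the busier queue cancel out the skew...
print(drain_time(load, {"0": 200, "1": 100}))
# ...while slower consumers on the busier queue amplify it.
print(drain_time(load, {"0": 100, "1": 200}))
```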

I don't know whether implementing load balancing via topic matches
will buy you much, so my advice is to try it out and measure it.

The reason I say this is that to answer with any authority you would
need intimate knowledge of how things get pushed through the entire
processing pipeline.

As far as I can see you would definitely be increasing the effective
workload on ingress - proportional to the degree that you want to load
balance.

On egress, I think the only gain you would see is when a queue cannot
farm work out quickly enough to the consumers listening on it, which,
if you had QoS, would be less likely to happen.

But having said that:

a) your mileage may vary;
b) you could use a bounded number of intelligently chosen wildcard
patterns (using a * for the last character will give you a load
balance of 1/n(charset) for randomly distributed keys);
c) topic caching is on the road map, and persuasiveness has been known
to lead to re-prioritization;
d) one could implement a custom exchange type that is specifically
designed to do load balancing;
e) you might find that you don't need to do this yet because your real
needs aren't anywhere near the limits (but due diligence is always
good).

BTW: * matches a single word, and # matches zero or more words.
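
A minimal sketch of these matching rules, assuming words are separated
by dots and wildcards only count when they stand for a whole word. The
function topic_match is a hypothetical helper, not part of any RabbitMQ
API. Under this reading a pattern such as "mytopic.host*3" is compared
literally and does not match "mytopic.host123".

```python
def topic_match(pattern, key):
    """True if the dot-separated key matches the pattern.

    '*' matches exactly one word, '#' matches zero or more words.
    """
    def match(p, k):
        if not p:
            return not k  # pattern exhausted: match only if key is too
        head, rest = p[0], p[1:]
        if head == "#":
            # '#' absorbs zero words, or one word while staying available.
            return match(rest, k) or (bool(k) and match(p, k[1:]))
        if not k:
            return False
        return (head == "*" or head == k[0]) and match(rest, k[1:])

    return match(pattern.split("."), key.split("."))


print(topic_match("mytopic.*", "mytopic.host123"))        # one word after mytopic
print(topic_match("mytopic.*", "mytopic.host123.extra"))  # two words after mytopic
print(topic_match("mytopic.#", "mytopic"))                # zero words after mytopic
print(topic_match("mytopic.host*3", "mytopic.host123"))   # '*' inside a word
```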

> Regardless, as long as a binding to "mytopic.*" counts as one binding in
> your O(n), even if I have M topics (mytopic.1, ... mytopic.m), then I think
> I'm good with whatever oddball scheme I come up with.

As indicated above, this is the correct way to view the O(n): as the
number of bindings.

HTH,

Ben