[rabbitmq-discuss] Topic routing

Wed Oct 22 23:23:15 BST 2008

Agreed on all of your points.  I am less worried about this being a  
showstopper for us now - I just want to have an idea of what options I  
might have to relieve pressure in this scenario.  Thanks for the  
thorough reply.  :)

Brian

On Oct 22, 2008, at 3:15 PM, Ben Hood wrote:

> Brian,
>
> On Wed, Oct 22, 2008 at 7:52 PM, Brian Sullivan <bsullivan at lindenlab.com> wrote:
>> I mean 3) number of actual topic keys that I publish on - if I need to add
>> some variables to the keys so I can distribute load across multiple queues
>> ("mytopic.1", "mytopic.2").  Some ideas I had here were a little funky and
>> might cause a multiplying of actual topic key strings that I use.  One idea
>> was to use the publisher's hostname in the topic key (ex: "mytopic.host123")
>> so that I could create a binding to "mytopic.host*3" to get ~1/10th the load
>> on each queue (however now that I think about it, I don't know if I can do
>> that with the wildcards anyways - I think * matches only full words in the
>> key, doesn't it?).
>
> Well, the O(n) I refer to is not option (3). The cost of routing a
> message on a topic exchange is linear in the number of bindings,
> irrespective of how many different routing keys you use.
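
A minimal sketch of the cost model described above, assuming the exchange
simply scans every binding for each published message. This is an
illustration only, not RabbitMQ's actual routing code; `word_match`, `route`
and the queue names are hypothetical, and the matcher handles only `*`.

```python
from typing import List, Tuple

# Hypothetical representation of a binding: (pattern, queue name).
Binding = Tuple[str, str]


def word_match(pattern: str, key: str) -> bool:
    """Simplified matcher: '*' stands for exactly one dot-separated word.

    '#' is deliberately not handled in this sketch.
    """
    p, k = pattern.split("."), key.split(".")
    return len(p) == len(k) and all(a == "*" or a == b for a, b in zip(p, k))


def route(bindings: List[Binding], key: str) -> Tuple[List[str], int]:
    """Return the matching queues and the number of pattern checks made."""
    checks = 0
    queues = []
    for pattern, queue in bindings:
        checks += 1  # one check per binding, whatever the routing key is
        if word_match(pattern, key):
            queues.append(queue)
    return queues, checks


bindings = [("mytopic.*", "q_all"), ("mytopic.1", "q_one"), ("other.*", "q_other")]

for key in ("mytopic.1", "mytopic.2", "mytopic.999"):
    queues, checks = route(bindings, key)
    print(key, queues, checks)
```

In this model the number of checks per message always equals the number of
bindings, however many distinct routing keys are published.
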
>
> Whether foo.*.[0-9] (that is, one binding per final word, foo.*.0
> through foo.*.9) delivers true load balancing will depend, on the one
> hand, on the relative frequency of each individual key.
>
> On the other hand, it will also depend on how homogeneous the capacity
> of consumption is on each queue - the more heterogeneous this is, the
> less load balancing you will experience.
>
> Though you could also imagine a situation whereby skew in the key
> distribution is counteracted by heterogeneous queue throughput, thus
> bringing balance into the system by accident.
>
> And you can imagine the opposite, where a skewed distribution is
> amplified by unbalanced queues.
>
> I don't know whether implementing load balancing via topic matches
> will buy you much, so my advice is to try it out and measure it.
>
> The reason I say this is that to answer this with any real authority
> you would have to have intimate knowledge of how things get pushed
> through the entire processing pipeline.
>
> As far as I can see you would definitely be increasing the effective
> workload on ingress - proportional to the degree that you want to load
> balance.
>
> On egress, I think the only gain you would see is when a queue cannot
> farm work out quickly enough to the consumers listening on it, which,
> if you had QoS, would be less likely to happen.
>
> But having said that:
>
> a) your mileage may vary;
> b) you could use a bounded number of intelligently chosen wildcard
> patterns (using a * for the last character will give you a load
> balance of 1/n(charset) for randomly distributed keys);
> c) topic caching is on the road map, and persuasiveness has been known
> to lead to re-prioritization;
> d) one could implement a custom exchange type that is specifically
> designed to do load balancing;
> e) you might find that you don't need to do this yet because your real
> needs aren't anywhere near the limits (but due diligence is always
> good).
>
> BTW: * matches a single word, and # matches zero or more words.
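
The wildcard rule above can be sketched as a small matcher. This is a
hedged illustration, not the broker's implementation: `topic_match` and
`_match` are hypothetical names, and the key layout "mytopic.<host>.<shard>"
with the shard as its own final word is an assumption made for the example.

```python
from collections import Counter
from typing import List


def topic_match(pattern: str, key: str) -> bool:
    """'*' matches exactly one word; '#' matches zero or more words."""
    return _match(pattern.split("."), key.split("."))


def _match(p: List[str], k: List[str]) -> bool:
    if not p:
        return not k  # pattern exhausted: match only if key is too
    head, rest = p[0], p[1:]
    if head == "#":
        # '#' may absorb anywhere from zero to all of the remaining words.
        return any(_match(rest, k[i:]) for i in range(len(k) + 1))
    if not k:
        return False
    return (head == "*" or head == k[0]) and _match(rest, k[1:])


# '*' is only a wildcard when it is a whole word, so "host*3" is a literal.
print(topic_match("mytopic.*", "mytopic.host123"))
print(topic_match("mytopic.host*3", "mytopic.host123"))
print(topic_match("mytopic.#", "mytopic"))

# Assumed layout: the shard digit is a separate final word of the key.
patterns = [f"mytopic.*.{d}" for d in range(10)]
keys = [f"mytopic.host{h}.{h % 10}" for h in range(100)]
load = Counter(p for key in keys for p in patterns if topic_match(p, key))
print(load)
```

With evenly distributed shard words, each of the ten bindings sees one
tenth of the keys; a skewed key distribution would skew `load` accordingly.
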
>
>
>> Regardless, as long as a binding to "mytopic.*" counts as one  
>> binding in
>> your O(n), even if I have M topics (mytopic.1, ... mytopic.m), then  
>> I think
>> I'm good with whatever oddball scheme I come up with.
>
> As indicated above, this is the correct way to view the O(n): as the
> number of bindings.
>
> HTH,
>
> Ben