[rabbitmq-discuss] routing threads on a rabbitmq node

Thu Feb 4 11:39:29 GMT 2010

Hi Brian,

On Wed, Feb 03, 2010 at 12:05:23PM -0800, Brian Sullivan wrote:
> We have ~75 bindings, same as the number of queues.  We don't do many
> multiple bindings per queue (if any).  This has increased faster than our
> message volumes (more consuming applications to make use of the data), so I
> believe this is the primary reason things are harder now than they used to
> be.

If pretty much every message is going to every queue, it may be much
simpler for you to use a fanout exchange and then drop the unwanted
messages at the consumer.
However, we're all in agreement that the use of topic exchanges here
isn't likely to be the problem.
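
To illustrate what "drop messages at the consumer" could look like, here
is a minimal sketch in Python. It assumes the usual AMQP topic rules
("*" matches exactly one word, "#" matches zero or more words); the
function names topic_matches and wanted are made up for this example and
are not part of any client library.

```python
def topic_matches(pattern: str, routing_key: str) -> bool:
    """Return True if an AMQP-style topic pattern matches a routing key.

    Assumed rules: words are separated by '.', '*' matches exactly one
    word and '#' matches zero or more words.
    """
    p = pattern.split(".")
    k = routing_key.split(".") if routing_key else []

    def match(i: int, j: int) -> bool:
        if i == len(p):
            # Pattern exhausted: succeed only if the key is exhausted too.
            return j == len(k)
        if p[i] == "#":
            # '#' may absorb zero or more of the remaining words.
            return any(match(i + 1, n) for n in range(j, len(k) + 1))
        if j < len(k) and p[i] in ("*", k[j]):
            return match(i + 1, j + 1)
        return False

    return match(0, 0)


def wanted(patterns, routing_key: str) -> bool:
    """Consumer-side filter: keep a message if any pattern matches."""
    return any(topic_matches(p, routing_key) for p in patterns)
```

A consumer would call wanted() on the routing key of each delivery and
simply ack and discard anything that returns False. Whether this is
cheaper overall depends on how many messages each consumer throws away.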

> What I would like to figure out is how to reorient my cluster to make things
> more stable.  Knowing that the routing time is increasing due to the number
> of bindings, I am not convinced that my plan of adding a rabbitmq node to
> each producer is going to make things all that much better - the routing
> table will still be the same, and it will need to do that cross-routing
> you're talking about avoiding.

What I would recommend is to use the recently announced shovel. Have one
central node, which the publishers send to. They send to a fanout
exchange. You then have some leaf nodes, each of which runs the shovel.
The shovel connects to the central node, creates a queue and binds it to
the fanout exchange, and republishes messages to a topic exchange on its
own leaf node.

You then split your various other queues over the exchanges on the leaf
nodes, thus dividing the outbound rate over the various leaf nodes.

The only thing that changes is that you need to somehow load balance
your consumers so that they know which leaf nodes to connect to. All the
leaf nodes would receive the same messages so there's no issue about
only being able to connect to certain nodes, but you do want to spread
the load evenly.

This would avoid using a cluster, and has the further advantage that as
your load grows, you can add further leaf nodes to share the load
seamlessly, without taking anything down.

> Even when we have a single producer catching
> up in our current system, the node can only route at a certain rate, and
> this is definitely not CPU bound.  I am curious why Erlang cannot spend more
> time in that thread, but I don't know much about it - does that seem right
> to you?

That is interesting. Did you mean "consumer" rather than "producer" at
the top there? Assuming you did, there could be a few reasons:

1. The client itself could be the bottleneck. In the absence of a QoS
setting, Rabbit will send messages to the consumer as fast as possible.
These messages arriving at the consumer obviously take up some CPU
resources to take them off the wire. Thus setting a QoS prefetch count
can limit the load on the consumer. However, setting it too low (eg 1)
can mean that the consumer sits idle for a little while after sending
back an ack, waiting for the next message to arrive. Some basic tuning
may be useful here, depending on the structure of your clients (eg are
they internally multithreaded etc).

2. TCP Buffers on client and RabbitMQ. There have been a couple of threads
recently on this list about buffer sizes. You may wish to try increasing
the TCP buffers of RabbitMQ so that it can load more data into the
buffers and pass it off to the network. You might wish to measure the
amount of network throughput you're seeing.

3. If QoS is off, and a queue has grown to a good length, then it's
possible for acks to be "stalled" whilst the queue tries to push
messages as fast as possible to the consumer. A build-up of acks can
hurt throughput. This has been fixed in 1.7.1. Now given that you're
saying the RabbitMQ node doesn't seem to be CPU bound here, I don't
think this is it, but I'd still suggest trying 1.7.1 when you can.
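
To make the trade-off in point 1 concrete, here is a deliberately
simplified model in Python. It assumes the consumer acks each message
once it has processed it, that the broker only sends a replacement once
that ack arrives, and that processing time and round-trip time are
constant. The function name and the numbers are made up for
illustration; real clients will behave less tidily.

```python
def consumer_rate(prefetch: int, process_s: float, round_trip_s: float) -> float:
    """Estimated messages/second for one consumer in a simple window model.

    prefetch      -- QoS prefetch count (unacked messages allowed in flight)
    process_s     -- seconds the consumer spends processing one message
    round_trip_s  -- seconds for an ack to reach the broker and the next
                     message to come back
    """
    if prefetch < 1:
        raise ValueError("prefetch must be at least 1 in this model")
    # The consumer can never go faster than its own processing speed.
    busy_limit = 1.0 / process_s
    # Each prefetch slot is refilled once per (processing + round trip).
    window_limit = prefetch / (process_s + round_trip_s)
    return min(busy_limit, window_limit)


if __name__ == "__main__":
    # Hypothetical figures: 1 ms of work per message, 4 ms round trip.
    for n in (1, 2, 5, 50):
        print(n, consumer_rate(n, 0.001, 0.004))
```

In this model a prefetch of 1 leaves the consumer idle for most of each
cycle, and raising the prefetch stops helping once the consumer itself
is saturated, which is why some measurement on your own clients is
worthwhile before picking a value.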

> I am not sure what I can do to minimize cross-routing, other than to try to
> keep our producers consolidated and keep the heaviest consumers (meaning the
> ones with a binding to the most active topics - remember that all queues
> bind to only one topic expression) separated on their own nodes, to remove
> their queue management processing on the core routing function.  Ironically,
> I was originally trying to keep the heaviest consumers on the routing nodes,
> to minimize forwarding of messages - but if the cost magnifies with the
> number of consumer queues, then it's likely that keeping the larger fanout
> (but smaller throughput) of consumers on the routing nodes might be best.

With the design I propose above, without the cluster, but with several
leaf nodes, I would suggest that you try to ensure the most active
queues are evenly distributed across the array of leaf nodes.
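
One simple way to work out such a distribution, sketched in Python: sort
the queues by their measured message rate and always give the next
busiest queue to the leaf node with the least load so far. The queue
names, node names and rates below are hypothetical, and this greedy
approach is just one heuristic, not something Rabbit does for you.

```python
import heapq


def spread_queues(rates: dict, leaf_nodes: list) -> dict:
    """Assign queues to leaf nodes, busiest queue first, least-loaded node first.

    rates       -- mapping of queue name to its measured messages/second
    leaf_nodes  -- list of leaf node names
    Returns a mapping of leaf node name to the list of queues placed on it.
    """
    # Min-heap of (current load, node name) so the lightest node pops first.
    heap = [(0.0, node) for node in leaf_nodes]
    heapq.heapify(heap)
    placement = {node: [] for node in leaf_nodes}

    for queue, rate in sorted(rates.items(), key=lambda kv: kv[1], reverse=True):
        load, node = heapq.heappop(heap)
        placement[node].append(queue)
        heapq.heappush(heap, (load + rate, node))
    return placement


if __name__ == "__main__":
    # Hypothetical queue rates in messages/second.
    example = {"orders": 100, "audit": 90, "alerts": 20, "reports": 10}
    print(spread_queues(example, ["leaf1", "leaf2"]))
```

Consumers then need some lookup (a config file would do) telling them
which leaf node holds their queue.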

> The thing that concerns me is that my scalability here seems to be limited -
> the only other thing I can think of doing is increasing my number of
> producers to distribute the load even further and possibly do the local node
> thing - then if our routing table keeps growing, I can manage scaling at the
> producer level - not efficient maybe, but at least it can grow past the
> threshold I appear to be running into.

Using the shovel and spreading out load to a number of leaf nodes (and
this hierarchy can be several layers deep if necessary) reduces the
amount of fanout on each node, and shares out the amount of data each
node needs to send out. This is more manual and involved, but more
efficient than a cluster.
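
As a back-of-the-envelope illustration of that reduction, the Python
sketch below counts how many copies of each published message a node
has to hand out. It assumes the worst case where every message matches
every queue, one shovel queue per leaf node on the central node, and
queues split as evenly as possible; the function name and the choice of
5 leaf nodes are made up for the example.

```python
import math


def copies_per_message(total_queues: int, leaf_nodes: int) -> dict:
    """Worst-case deliveries per published message on each kind of node."""
    if total_queues < 1 or leaf_nodes < 1:
        raise ValueError("need at least one queue and one leaf node")
    return {
        # A single routing node delivers to every queue itself.
        "single_node": total_queues,
        # The central node only feeds one shovel queue per leaf node.
        "central_node": leaf_nodes,
        # Each leaf node delivers only to its own share of the queues.
        "each_leaf_node": math.ceil(total_queues / leaf_nodes),
    }


if __name__ == "__main__":
    # ~75 queues as in your setup, spread over a hypothetical 5 leaf nodes.
    print(copies_per_message(75, 5))
```

With those assumptions no single node has to fan a message out to more
than 15 queues, compared with 75 on one routing node.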

Please let us know how you get on.

Matthew