[rabbitmq-discuss] Node scheduling with RabbitMQ

Marek Majkowski majek04 at gmail.com
Wed Aug 11 17:11:29 BST 2010


Hi Pete,

On Mon, Aug 9, 2010 at 03:55, Pete Hunt <floydophone at gmail.com> wrote:
> I was considering writing an x-script exchange for RabbitMQ and
> creating one queue per worker to dump messages in; this is a feasible
> way to go, right?

Well, the question about data locality is pretty tricky to answer.
First, if you have a fixed number of workers, then you can just create
two queues for every worker: one queue with requests for 'data to be
prefetched' and another one with the real task requests.
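
A rough sketch of that layout, assuming a pika 1.x style channel that is
already open. The queue naming scheme and both function names are made up
for illustration; nothing in RabbitMQ requires them.

```python
def worker_queue_names(worker_id):
    """Return the (prefetch, task) queue names for one worker.

    The "worker.<id>.*" naming is a hypothetical convention.
    """
    return (f"worker.{worker_id}.prefetch", f"worker.{worker_id}.tasks")


def declare_worker_queues(channel, worker_ids):
    """Declare both queues for every worker on an already-open channel.

    `channel` is assumed to offer queue_declare(queue=..., durable=...),
    as a pika 1.x channel does.
    """
    declared = []
    for worker_id in worker_ids:
        for name in worker_queue_names(worker_id):
            # durable=True so the queue definition survives a broker restart
            channel.queue_declare(queue=name, durable=True)
            declared.append(name)
    return declared
```

Whatever decides where the data should live would then publish a prefetch
request to the first queue and the real task to the second one.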

But it's more common that you don't know in advance how many workers
will be needed. In that case, I'm sorry, but if doing a task requires
downloading data, then, well, it requires downloading data.
If a task is two-stage, I'm afraid you can't do much
about it. Maybe run more than one worker on a single host, to make
sure that the CPU is always saturated even while another worker is busy
downloading?

> The exchange could probably handle data locality,
> however, what happens when a worker dies with messages sitting on the
> queue?

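In AMQP, unless the consumer opts out of acknowledgments (no-ack mode),
every delivered message has to be acknowledged before it's removed from
the server. If your worker received a message and died without acking it,
the broker will requeue the message once the worker's connection closes
and redeliver it to another worker.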
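
Roughly, a worker that relies on this looks like the sketch below. It
assumes pika 1.x and a broker on localhost; the queue name "tasks" and
process_task() are placeholders, not anything from your setup.

```python
def process_task(body):
    """Placeholder for the real, idempotent work."""
    print("processing", body)


def on_message(channel, method, properties, body):
    """Handle one delivery and ack it only after the work is done."""
    process_task(body)
    # If the worker dies before this line, the broker never sees an ack
    # and requeues the message when the connection goes away.
    channel.basic_ack(delivery_tag=method.delivery_tag)


def run_worker(queue_name="tasks", host="localhost"):
    """Connect and consume with explicit acks (blocks until interrupted)."""
    import pika  # imported here so on_message can be exercised without pika

    connection = pika.BlockingConnection(pika.ConnectionParameters(host=host))
    channel = connection.channel()
    channel.queue_declare(queue=queue_name, durable=True)
    # auto_ack=False means the broker waits for basic_ack from on_message
    channel.basic_consume(
        queue=queue_name, on_message_callback=on_message, auto_ack=False
    )
    channel.start_consuming()
```

Calling run_worker() in each worker process is enough; the important part
is that the ack is sent after the task, not before it.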

> I also am relatively new to the non-Amazon SQS message queuing world,
> so I guess that this becomes less a question of how to do proper data
> locality and more about how to do reliability with RabbitMQ. On SQS,
> due to the visibility timeout semantics I am assured that if I design
> idempotent tasks they will eventually occur even if there is a node
> failure.

Yes, instead of timeouts AMQP/RabbitMQ uses explicit acknowledgments.
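
So where SQS makes a message visible again after a timeout, here the
worker itself says what should happen. A small sketch, assuming a pika 1.x
style channel; make_handler() and the `task` callable are hypothetical
names. Note that a task which always fails would be requeued forever, so a
real worker would want some limit on retries.

```python
import logging


def make_handler(task):
    """Wrap `task` (a callable taking the message body) in an ack/reject policy."""

    def on_message(channel, method, properties, body):
        try:
            task(body)
        except Exception:
            logging.exception("task failed, asking the broker to requeue it")
            # requeue=True puts the message back on the queue for another try
            channel.basic_reject(delivery_tag=method.delivery_tag, requeue=True)
        else:
            # Success: only now may the broker forget the message
            channel.basic_ack(delivery_tag=method.delivery_tag)

    return on_message
```

Because a requeued message may be processed more than once, this only
works well with idempotent tasks, the same as with visibility timeouts.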

> Right now, the solution that makes the most sense to me is to have a
> single consumer that farms the tasks out to the worker nodes, however,
> this means that I need to build reliability capabilities like
> visibility timeouts into the consumer rather than relying on a battle
> tested piece of software like RabbitMQ. Is there some sort of
> functionality in RabbitMQ that accounts for this? Am I missing an
> important piece of documentation somewhere?

Look at the basic.consume examples to see how the acknowledgments are done.
For example:
http://github.com/tonyg/pika/blob/master/examples/demo_receive.py#L74

You will also be interested in the 'qos' command (basic.qos), which limits
how many unacknowledged messages the broker sends to a consumer at once.
For example:
http://github.com/majek/pika/blob/master/examples/demo_channel_flow_asyncore.py#L55
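
In short, qos plus explicit acks could be wired up something like this.
It is only a sketch against the pika 1.x channel API;
start_fair_consumer() and its parameters are names I made up.

```python
def start_fair_consumer(channel, queue_name, on_message, prefetch_count=1):
    """Limit unacked deliveries, then start consuming with explicit acks.

    `channel` is assumed to be an open pika 1.x channel and `on_message`
    a callback taking (channel, method, properties, body).
    """
    # With prefetch_count=1 the broker holds back the next message until
    # the previous one has been acked, so a slow worker is not handed a
    # backlog while other workers sit idle.
    channel.basic_qos(prefetch_count=prefetch_count)
    return channel.basic_consume(
        queue=queue_name, on_message_callback=on_message, auto_ack=False
    )
```

A higher prefetch_count trades some fairness for throughput, since each
worker then keeps a few messages buffered locally.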

Cheers,
   Marek

