[rabbitmq-discuss] Fwd: Sequential processing for groups of messages with a multi-consumer setup

Tue Jun 26 18:23:33 BST 2012

Oops - sent this to that pesky google group by mistake.

Begin forwarded message:

> From: Tim Watson <watson.timothy at gmail.com>
> Date: 26 June 2012 09:45:37 GMT+01:00
> To: Ed Levin <ed at cloudservicesdepot.com>
> Cc: "rabbitmq-discuss at googlegroups.com" <rabbitmq-discuss at googlegroups.com>
> Subject: Re: [rabbitmq-discuss] Sequential processing for groups of messages with a multi-consumer setup
> 
> You might want to take a look at Matthew Sackman's consistent hash exchange. Provided you use the ticket ID as the routing key, this can ensure that all messages with the same routing key end up on the same queue. You could then create a queue and bind it to the exchange, adding a single consumer. If only one message per unique ticket ID is allowed to proceed at any given time, you'll ensure the semantics that way, but of course if processing takes a long time, you could end up with a backlog on the queue, reducing performance.
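
To illustrate why a hash-based exchange keeps one ticket's messages together, here is a minimal Python sketch of a hash ring that maps routing keys to queue names. It is not the plugin's actual algorithm; the class name HashRing, the parameter points_per_queue, and the queue names are hypothetical and exist only for this illustration.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring mapping routing keys to queue names.

    Illustration only: this is NOT the consistent hash exchange's own code.
    """

    def __init__(self, queues, points_per_queue=100):
        # Each queue owns several points on the ring so keys spread evenly.
        self._ring = sorted(
            (self._hash(f"{name}#{i}"), name)
            for name in queues
            for i in range(points_per_queue)
        )
        self._points = [point for point, _ in self._ring]

    @staticmethod
    def _hash(value):
        digest = hashlib.sha256(value.encode("utf-8")).digest()
        return int.from_bytes(digest[:8], "big")

    def queue_for(self, routing_key):
        # Pick the first ring point after the key's hash, wrapping at the end.
        index = bisect.bisect_right(self._points, self._hash(routing_key))
        return self._ring[index % len(self._ring)][1]


ring = HashRing(["q1", "q2", "q3"])
# The same ticket ID always lands on the same queue.
assert ring.queue_for("ticket-42") == ring.queue_for("ticket-42")
```

In this sketch, adding a fourth queue moves only the keys that now fall on the new queue's points; every other ticket ID stays where it was.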
> 
> Another approach would be to have the single consumer drain the queue continuously and hand off messages immediately to a worker in your application. Protect that worker with a blocking queue (or whatever barrier technique your chosen platform supports) and it'll process tasks sequentially, but the consumption of messages from the rabbit queue won't be held up.
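
The hand-off described above might look roughly like the following Python sketch, which uses only the standard library. The names TicketDispatcher, dispatch and handler are hypothetical, and the broker client is left out: dispatch() stands in for whatever callback your client library invokes per delivered message, and is assumed to be called from a single consumer thread.

```python
import queue
import threading

_STOP = object()  # sentinel telling a worker to exit


class TicketDispatcher:
    """Hands each message to a per-ticket worker so tickets run in order."""

    def __init__(self, handler):
        self._handler = handler
        self._inboxes = {}   # ticket_id -> queue.Queue
        self._threads = []

    def dispatch(self, ticket_id, body):
        # Called from the single consumer thread, so no lock is needed here.
        inbox = self._inboxes.get(ticket_id)
        if inbox is None:
            inbox = queue.Queue()
            self._inboxes[ticket_id] = inbox
            worker = threading.Thread(
                target=self._work, args=(ticket_id, inbox), daemon=True
            )
            worker.start()
            self._threads.append(worker)
        inbox.put(body)  # returns immediately; the consumer is never blocked

    def _work(self, ticket_id, inbox):
        while True:
            body = inbox.get()  # blocks until the next message for this ticket
            if body is _STOP:
                return
            self._handler(ticket_id, body)

    def close(self):
        # Ask every worker to finish its backlog and exit, then wait for them.
        for inbox in self._inboxes.values():
            inbox.put(_STOP)
        for worker in self._threads:
            worker.join()
```

Messages for one ticket are handled strictly in arrival order, while different tickets proceed in parallel. A thread per ticket is the simplest form; a fixed pool keyed by a hash of the ticket ID would bound the thread count.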
> 
> You assert below that having a single consumer per queue doesn't scale, but AFAICT no matter how much acrobatics you perform, you can't get away from the fact that each ticket ID has to be handled sequentially. The parallelism you want, therefore, lies in consuming messages as they arrive without blocking delivery of new ones. A consumer per ticket ID makes sense in that situation (I would've thought) and pairing the consumer thread with a worker (thread) means you don't have to block incoming messages.
> 
> In that scenario, you're using rabbit's consistent hash exchange to make sure you know where a particular ticket ID will end up, using temporary queues (bound to the exchange) to allow a consumer per ticket ID and handling the concurrency issue where it belongs - in the application logic.
> 
> With this kind of design, you would of course have to deal with creating new exchanges and binding them to ticket IDs whenever you do not have an existing binding in place for them already. This will require a control channel, which could take advantage of a default exchange. You might also want to think about housekeeping for ticket IDs that are no longer active. You'll potentially want to shut down consumers once they've been inactive for some period of time, closing the queue and removing the exchange. The consumer is really in the best position to handle this, as it can easily keep a timer or do non-blocking reads.
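
The idle-timeout housekeeping can be sketched in Python with a timed read on the worker's inbox. The function name run_ticket_worker and the on_idle callback are hypothetical; on_idle is where an application would cancel its consumer and delete the temporary queue, which is not shown here.

```python
import queue


def run_ticket_worker(inbox, handler, idle_timeout, on_idle):
    """Process messages from inbox until none arrive for idle_timeout seconds."""
    while True:
        try:
            # A timed get() doubles as the inactivity timer.
            body = inbox.get(timeout=idle_timeout)
        except queue.Empty:
            on_idle()  # hypothetical hook: tear down consumer and queue here
            return
        handler(body)
```

One caveat with this sketch: a message can arrive just as the worker decides it is idle, so the teardown in on_idle needs to cope with a non-empty queue (for example by re-checking before deleting).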
> 
> The advantage of the consistent hash exchange in this architecture is that it will cope with the changing topology of exchanges and bindings without requiring explicit reconfiguration all the time, although there may be some scenarios in which explicit application intervention would make sense.
> 
> HTH. 
> 
> Tim
> 
> On 26 Jun 2012, at 03:30, Ed Levin <ed at cloudservicesdepot.com> wrote:
> 
>> Hi,
>> 
>> I am considering RabbitMQ for a project with the following requirement and wanted to get some feedback since I am relatively new to message queuing:
>> 
>> To start, I have a queue with a number of consumers assigned for concurrent processing. The messages published to a queue will represent tickets coming from a ticketing system and some may be associated with the same ticket (e.g. same ticket number on different messages in the queue). The problem is that I can only consume (work on) a single message belonging to a given ticket at a time to avoid external concurrency issues. In other words, if consumer A is working on a message representing ticket 1 and, at the same time, consumer B is available and is assigned the next message from the queue that also represents ticket 1, we have a problem.
>> 
>> So far I am considering two possible options:
>> 
>> 1. Dynamically create a temp queue per new unique ticket that arrives, and in effect sort messages into their own queues by ticket number. The problem with this (if even possible) is how to manage consumers for the temp queues. If I assign a single consumer to each queue, the system does not scale; if I assign a consumer to multiple queues, I am back to the original problem.
>> 
>> 2. Keep a single queue with multiple consumers assigned and implement a cache to track ticket numbers currently being processed. When a consumer receives a new message, it first checks the cache to see if the given ticket is already being processed. If so, it re-queues the message; otherwise it adds the ticket to the cache.
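
For comparison, option 2 could be sketched in Python as below. The names InFlightTickets and on_message, and the process and requeue callbacks, are hypothetical; requeue stands in for whatever reject-with-requeue call your client library provides. The sketch assumes all consumers share one process; consumers on separate machines would need a shared external store instead of an in-memory set.

```python
import threading


class InFlightTickets:
    """Thread-safe set of ticket IDs that are currently being processed."""

    def __init__(self):
        self._lock = threading.Lock()
        self._active = set()

    def try_acquire(self, ticket_id):
        # Check and add under one lock so two consumers cannot both succeed.
        with self._lock:
            if ticket_id in self._active:
                return False
            self._active.add(ticket_id)
            return True

    def release(self, ticket_id):
        with self._lock:
            self._active.discard(ticket_id)


def on_message(tracker, ticket_id, body, process, requeue):
    """Process body unless its ticket is busy; return True if processed."""
    if not tracker.try_acquire(ticket_id):
        requeue(body)  # ticket is busy: hand the message back to the broker
        return False
    try:
        process(body)
    finally:
        tracker.release(ticket_id)  # always free the ticket, even on error
    return True
```

A busy ticket can cause its messages to be requeued and redelivered repeatedly, and requeuing may affect the order in which a ticket's messages are eventually processed, so this approach trades broker churn for a simpler queue layout.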
>> 
>> I am leaning towards option 2 but appreciate any feedback and wisdom from the community (perhaps I am missing something obvious).
>> 
>> Thank you,
>> 
>> -Ed
>> _______________________________________________
>> rabbitmq-discuss mailing list
>> rabbitmq-discuss at lists.rabbitmq.com
>> https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss