[rabbitmq-discuss] Newbee consulting question

Wed Apr 10 10:06:56 BST 2013

Hi,

On 09/04/13 17:05, Tomer Paz wrote:
> The closest description to our architecture is Amazon SWF. it is ~95%
> the solution concept to our requirements, but we can't use it as it is a
> cloud SaaS and our requirement is 'on-premise' :)

Have you looked at Celery? http://www.celeryproject.org/
However, your requirements look specific enough that you will
probably need to create a more specialised workflow manager.

> I was wondering How to manage distributed Tasks where the task
> manager ('orchestrator') is at the data-center, the workers are
> distributed in different physical sites (some on the same site, some
> far in other site separated by WAN), where the connection between the
> datacenter and the sites might be disrupted sometimes.

In general, workers connect to the broker via AMQP. If the connection
between the broker and a worker is disrupted, any unacknowledged
messages held by that worker are requeued at the broker and redelivered,
so that work will be repeated. If network disconnections are likely,
workers should report completion of the work by publishing a new
message rather than by acknowledging the work message.
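
A minimal sketch of that idea, assuming a pika-style channel object
(basic_publish, basic_ack) and JSON message bodies. The queue name
"task.completed", the "task_id" field and the handle_task helper are
made up for illustration. Here the ack only tidies up the work queue;
the published event is what tells the manager the work is done.

```python
import json

COMPLETED_QUEUE = "task.completed"  # hypothetical queue name


def handle_task(channel, method, body, do_work):
    """Run one task and report completion as a new message."""
    task = json.loads(body)
    result = do_work(task)
    event = {"task_id": task["task_id"], "status": "done", "result": result}
    # Default exchange: the routing key is the name of the target queue.
    channel.basic_publish(
        exchange="", routing_key=COMPLETED_QUEUE, body=json.dumps(event)
    )
    # Ack last. If the link drops before this point the task is redelivered,
    # and the manager can drop the duplicate completion by task_id.
    channel.basic_ack(delivery_tag=method.delivery_tag)
    return event
```

Because a redelivered task produces a second completion event, the
consumer of "task.completed" needs to tolerate duplicates.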

> Since the requirement is for WorkFlow management, it means we want tasks
> marked done by a remote worker, will be reported back to the manager, to
> keep the State of all tasks managed in central manner.

Workers can publish as well as consume messages. Workers can publish the
outcome of a piece of work to a dedicated queue used for tracking and
statistics.
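
On the manager side, the tracking queue could be folded into a task
table roughly like this. It is only a sketch: the message fields
("task_id", "status", "worker") and the apply_outcome helper are
assumptions, and a real manager would persist the table somewhere.

```python
import json


def apply_outcome(state, body):
    """Fold one outcome message into the manager's task table.

    Returns True if the table changed, False for a duplicate report.
    """
    event = json.loads(body)
    task_id = event["task_id"]
    if state.get(task_id, {}).get("status") == "done":
        return False  # work was redelivered and reported twice; keep the first
    state[task_id] = {"status": event["status"], "worker": event.get("worker")}
    return True
```

Ignoring a second "done" report for the same task keeps the table
consistent when a worker repeats work after a disconnection.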

> we also need the capability to parallelise 'steps' (that's what
> workflow orchestration is about) within a parent task so that some
> tasks in the workflow tree will be processed consecutively, some in
> parallel.

Work will be executed in parallel if workers consume messages with a
small prefetch count, or if they retrieve work synchronously with
basic.get. There are different ways of ensuring that a set of tasks is
performed in sequence - the simplest is to roll up all the details of
the work into a single message. A worker that completes one stage of
processing republishes an updated version of the message to the next
stage, possibly using a different queue for each stage.
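
As a rough illustration of the rolled-up message, assume the body is
JSON carrying an ordered list of stage queues, a step counter and the
results so far. These field names and the advance helper are invented
for the example, and the channel is assumed to be pika-style.

```python
import json


def advance(channel, body, result):
    """Record this stage's result and republish to the next stage's queue.

    Returns the next queue name, or None when the last stage has finished.
    """
    msg = json.loads(body)
    msg["results"].append(result)
    msg["step"] += 1
    if msg["step"] >= len(msg["stages"]):
        return None  # nothing left to do; no further message is published
    next_queue = msg["stages"][msg["step"]]
    # Default exchange: the routing key is the name of the target queue.
    channel.basic_publish(
        exchange="", routing_key=next_queue, body=json.dumps(msg)
    )
    return next_queue
```

A worker on the first stage would call advance(channel, body, result)
once its own work is complete, then acknowledge the original message.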

> Assume workers are not necessarily the same. i.e. there are different
> task "types" thus different workers processing them (hence this is not a
> simple pub/sub pattern either).

You can meet this requirement by routing messages to work queues
according to their type, and having each type of worker consume only
from the queue that holds its type of message.
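
One way to set this up is a direct exchange with the task type as the
routing key. The sketch below assumes a pika-style channel; the
exchange name "tasks", the "tasks.<type>" queue naming and both helper
functions are illustrative only.

```python
import json

TASK_EXCHANGE = "tasks"  # hypothetical exchange name


def declare_type_queues(channel, task_types):
    """Declare one durable queue per task type, bound by its type name."""
    channel.exchange_declare(
        exchange=TASK_EXCHANGE, exchange_type="direct", durable=True
    )
    for task_type in task_types:
        queue = "tasks." + task_type
        channel.queue_declare(queue=queue, durable=True)
        channel.queue_bind(
            queue=queue, exchange=TASK_EXCHANGE, routing_key=task_type
        )


def publish_task(channel, task_type, payload):
    """Publish a task; the direct exchange routes it by its type."""
    channel.basic_publish(
        exchange=TASK_EXCHANGE, routing_key=task_type, body=json.dumps(payload)
    )
```

A worker that handles, say, "resize" tasks would then consume only
from "tasks.resize".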

> we want the architecture to be "pull" rather than "push" in terms of
> workers 'asking' the workflow manager to get new tasks when they are
> free, rather than task manager pushing tasks to workers blindly.

Workers can retrieve messages from a queue synchronously using the
basic.get method.
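
For example, with a pika-style blocking channel, whose basic_get
returns (None, None, None) when the queue is empty, a worker could
pull one task at a time roughly as follows. The pull_one helper and
its handler argument are made up for this sketch.

```python
def pull_one(channel, queue, handler):
    """Fetch at most one task with basic.get; return True if one was handled."""
    method, properties, body = channel.basic_get(queue=queue, auto_ack=False)
    if method is None:
        return False  # queue is empty; the caller decides when to ask again
    handler(body)
    # Ack only after the handler returns, so a crash leaves the task queued.
    channel.basic_ack(delivery_tag=method.delivery_tag)
    return True
```

The worker calls pull_one whenever it is free and sleeps for a while
when it returns False. Consuming with a small prefetch count, as
mentioned earlier, gives a similar effect without polling.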

-Emile