[rabbitmq-discuss] rabbitmq-discuss Digest, Vol 18, Issue 34

Alexis Richardson alexis.richardson at cohesiveft.com
Mon Nov 17 12:11:58 GMT 2008


Thanks very much for your input and detail on this.

On Sun, Nov 16, 2008 at 11:59 PM, Ilya Grigorik <ilya at aiderss.com> wrote:
>> * I am assuming flow control does not solve your problem - I think you
>> have made that clear, but can I just check?
> Yep, in fact, I would much rather not throttle the producers - the queue
> becomes a leaky abstraction.

Good - that is what I thought.

> I like to think of Rabbit (or any other
> router/queue in between) as a buffer, hence it should accept and fan out the
> messages without exposing the lag on the consumer side.

Absolutely.  You'll appreciate of course that when we began the
RabbitMQ project we assumed that the use cases would align well with
those which led to the original AMQP spec.  Of course since then we
have learnt that AMQP really does have potential for widespread
applicability as a true 'internet protocol'.  And for general purpose
messaging this can include the use case where vast numbers of messages
must be persisted for a long time, because there are no consumers
available.  Although supporting this case is a natural extension of
the broker, let's see if we cannot find a workaround in the meantime.

(Incidentally, at an early stage in AMQP's design it was intended to
include file streaming as a core use case but this has been postponed
at least at the protocol level.)

> I may be conflating two different requirements, but here is an interesting
> scenario to think about: our entire infrastructure is deployed on EC2, and
> time and time again we've found SQS (simple queue service) a lifesaver for
> big data migrations or general maintenance. We're generating a lot of
> real-time data, and SQS allows us to reroute all of it into a temporary
> queue, and let it accumulate (we've pushed hundreds of GBs of data into it)
> while we do our work, and once we're done, we just flush the queue. No
> additional infrastructure required. (same thing for aggregate jobs, or even
> one-off trial projects)

The SQS use case is interesting and it is one that I am tracking.

I think you have illustrated that you are not conflating requirements
and that the notion of a buffer does not necessarily imply eager
consumption.

> Arguably, we could use a different piece of infrastructure to cover this
> case (another rabbit consumer which stores data into a database), but if
> Rabbit could handle this, life is much easier.

Well - this is where I was going - for a workaround.  I know of people
who have done this and have been trying to tease these use cases into
the public domain and on-list.  Bear with us.

> (Wild idea: provide an SQS adapter to reroute packets into at a certain
> threshold.. Perhaps not so wild if we get a pluggable interface)

Interesting - I'd prefer to address your core problem, though.

>> * How often do you expect the 250Gb case to happen - do you consider
>> it exceptional or normal?
> 250GB is an exception, but smaller amounts (1-5Gb) are fairly frequent
> (which is theoretically possible even with the current pure memory
> deployment).

So for 5GB, where the messages do not all need to be on one queue, you
should be fine with a cluster of 1-4 broker nodes.  Obviously these
would be persisted in the normal way.

For cases where you do not have enough memory: You could use Ben's
producer control flow to deliver an alert when a memory threshold is
breached.  Then you could drain the messages into a data store.  This
is a workaround, provided ordering and latency issues do not create
bottlenecks or other problems at high scale.
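To make the workaround concrete, here is a minimal sketch of the idea in Python. The class and names are illustrative, not a RabbitMQ API: a real deployment would use an AMQP client plus a database or disk as the overflow store, with the drain triggered by the producer flow-control alert. Here plain in-memory structures stand in for both so the logic is self-contained:

```python
from collections import deque

class DrainingBuffer:
    """Toy model of the threshold-drain workaround: messages accumulate
    in memory until a size threshold is breached, then the oldest
    messages are drained to an 'overflow store' (here just a list,
    standing in for a database or disk)."""

    def __init__(self, threshold):
        self.threshold = threshold  # max messages held in memory
        self.memory = deque()       # stand-in for the broker's queue
        self.overflow = []          # stand-in for the external store

    def publish(self, msg):
        self.memory.append(msg)
        # Flow-control alert fired: threshold breached, start draining
        # the oldest messages out to the store.
        while len(self.memory) > self.threshold:
            self.overflow.append(self.memory.popleft())

    def consume_all(self):
        # Consumers return: replay the overflow store first so delivery
        # order matches production order, then the live queue.
        drained = self.overflow + list(self.memory)
        self.overflow.clear()
        self.memory.clear()
        return drained

buf = DrainingBuffer(threshold=3)
for i in range(10):
    buf.publish(f"msg-{i}")
print(len(buf.overflow))          # 7 messages paged out
print(buf.consume_all()[:2])      # ['msg-0', 'msg-1'] - order preserved
```

The key property is that production order survives the detour through the store, which is exactly where the latency question above comes in: replaying the store is a sequential read, not a broker operation.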

>> * Do you have ordering constraints, and/or must all the messages be on one
>> queue
> Depends on the data. Most of it is order agnostic, but there is some data
> which only makes sense when delivered in order.

Ok, then you could use multiple queues and rely on them for ordering.

My general advice to people with vast amounts of data where consumers
may need to recover production order later, is to attach application
level identifiers to each message.  That way you can use as many
queues as you like, in extremis, across any number of nodes and
message storage locations.
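The identifier scheme above can be sketched in a few lines of Python (the function names are mine, purely for illustration): the producer stamps each message with a monotonically increasing sequence number before fanning out across any number of queues, and a consumer later merges the queues back into production order.

```python
import heapq
import itertools

def tag_messages(payloads):
    """Producer side: attach an application-level sequence id to each
    payload before publishing, independent of which queue it lands on."""
    counter = itertools.count()
    return [{"seq": next(counter), "body": p} for p in payloads]

def recover_order(*queues):
    """Consumer side: merge messages drained from several queues back
    into production order using the embedded sequence ids.  Each queue
    preserves its own FIFO order, so a streaming k-way merge suffices."""
    return [m["body"] for m in
            heapq.merge(*queues, key=lambda m: m["seq"])]

messages = tag_messages(["a", "b", "c", "d", "e", "f"])
# Shard the tagged messages round-robin over three queues
# (any sharding scheme works, since order lives in the messages).
q1, q2, q3 = messages[0::3], messages[1::3], messages[2::3]
print(recover_order(q1, q2, q3))  # ['a', 'b', 'c', 'd', 'e', 'f']
```

Because each queue is itself FIFO, the merge is streaming rather than a full sort, so recovery works even when the queues hold far more than fits in memory.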

>> * In the case where you need to page to disk due to a lack of machine
>> memory, and because your queues are not being drained by consumers,
>> does it matter what the latency of message delivery is once the
>> consumers come back?
> No - within reason, of course. ;-)

So, it would be tolerable (performance-wise) to retrieve your 250GB of
messages as a stream, which would be reconstructed from the putative
overflow store.

This is all pending a proper page-to-disk solution, of course.

>> The point being that there may be some workaround that you can try,
>> depending on the answer to the above.  You may be able to see where I
>> am going by the last question... ;-)
> Absolutely. I'm not saying Rabbit should do all of the things I've described
> either. Of course the more the better, but even more importantly I'd like to
> see some sane overflow conditions: ttl, max queue size, etc. That I can plan
> for, crashing.. won't work. ;-)

Yes, I think we are ALL on the same page here (aargh, sorry..)
