[rabbitmq-discuss] Cluster Pathology

Fri Feb 13 16:57:32 GMT 2009

Drew,

On Fri, Feb 13, 2009 at 3:34 PM, Drew Smathers <drew.smathers at gmail.com> wrote:
> Thanks for the information.  The consumer is not as much my concern as
> the publisher (also attached to A) who would continue publishing
> messages which should be delivered but get discarded. (Btw, it's still
> _very_ unclear to me how to get notification that a message cannot be
> routed.)

IMHO this shouldn't be the producer, because that would couple
production and consumption, which is exactly what the messaging broker
is supposed to avoid. I think it should be something else that is
monitoring the presence of the queue.
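
A minimal sketch of such a separate monitor, under stated assumptions:
the names watch_queue, queue_exists and on_missing are hypothetical and
not part of any RabbitMQ client API. The probe is passed in as a
callable so the sketch does not depend on a particular client library;
in AMQP, a passive queue.declare is one way such a probe could be
implemented.

```python
import time
from typing import Callable, Optional


def watch_queue(queue_exists: Callable[[], bool],
                on_missing: Callable[[], None],
                interval: float = 5.0,
                max_checks: Optional[int] = None,
                sleep: Callable[[float], None] = time.sleep) -> int:
    """Poll queue_exists() and call on_missing() whenever the queue is absent.

    queue_exists and on_missing are hypothetical hooks supplied by the
    caller. Returns the number of times on_missing() was called.
    max_checks=None polls forever.
    """
    missing = 0
    checks = 0
    while max_checks is None or checks < max_checks:
        checks += 1
        try:
            present = queue_exists()
        except Exception:
            # Assumption: a failing probe (e.g. a socket error) is
            # treated the same as a missing queue.
            present = False
        if not present:
            missing += 1
            on_missing()
        # Do not sleep after the final check.
        if max_checks is None or checks < max_checks:
            sleep(interval)
    return missing
```

Because the monitor is its own process or thread, the publisher stays
unaware of the consumers; only the monitor knows which queues are
expected to exist.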

> I'm solving this issue by making publisher attach to only
> one node where the queue is defined so socket errors would stop the
> publisher; this is appropriate for our system where there are very few
> publishers but many consumers.  We're also keeping a rotating log for
> critical messages as another point of recovery in the event a
> persister log cannot be recovered.

This would be a slightly different issue though. Are you saying that
you are having trouble replaying the log?

> I haven't finalized what to do
> from the consumers' perspective except perhaps having some activity
> monitor with a timeout to trigger reestablishing a channel, queue
> declarations, etc.  Any ideas how to best handle this without too many
> complications such as AMQP-level events, etc?

ATM you can't do anything with AMQP-level events because they are
currently vaporware :-) The idea behind them is an event system that
provides the necessary primitives for userland code to extend broker
functionality without having to bloat the core. So right now, your
only options are to either build a supervisor into the broker
(which requires you to hack on the broker), or to do some kind of
external polling.
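
The consumer-side activity monitor with a timeout mentioned above can
be done entirely in the client. A rough sketch, assuming a hypothetical
ActivityWatchdog class and a caller-supplied reconnect callable that
reopens the channel and redeclares the queue (neither is part of any
RabbitMQ client library):

```python
import time
from typing import Callable


class ActivityWatchdog:
    """Trigger reconnect() when no message has arrived within `timeout` seconds."""

    def __init__(self, timeout: float,
                 reconnect: Callable[[], None],
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.timeout = timeout
        self.reconnect = reconnect  # hypothetical hook: reopen channel, redeclare queue
        self.clock = clock
        self.last_seen = clock()

    def touch(self) -> None:
        """Call from the message handler on every delivery."""
        self.last_seen = self.clock()

    def check(self) -> bool:
        """Call periodically; returns True if a reconnect was triggered."""
        if self.clock() - self.last_seen >= self.timeout:
            self.reconnect()
            # Restart the timer so one outage does not reconnect on every check.
            self.last_seen = self.clock()
            return True
        return False
```

Note that a timeout alone cannot tell a vanished queue from a quiet
one, so the timeout has to be chosen against the expected message rate,
and reconnect() should be safe to call when nothing was actually wrong.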

>> Obviously it would be nice to have better handling for this kind of
>> thing, which will probably happen at some stage.
>>
>
> Yes, please :)  If there are significant performance impacts, I think
> it would still be nice to have as an optional runtime configuration
> for applications where "(99.99999%) guaranteed delivery" is a
> requirement more than overall throughput.

I'll take this down as something to consider doing. No ETA on this, of course.

Ben