[rabbitmq-discuss] mitigating broker and consumer outages
mikeb at rabbitmq.com
Wed Sep 15 12:55:02 BST 2010
>> 1. I've noticed that all consumers connected to a broker (via Spring
>> AMQP) stop whenever the broker is stopped. Is this behavior expected?
They will stop in the sense that when the connection to the broker is
dropped, it is not automatically picked up again (well I don't think so,
I've not looked at spring-amqp specifically).
Automatic reconnection is something that could certainly be built into
clients (but isn't, in the case of our Java client) and applications,
but it's not without difficulty -- see below.
>> 2. What's the best practice for doing a complete reset (restarting the
>> broker and consumers)?
I don't know if there's a commonly understood best practice for how to
restart consumers (especially lots of consumers) -- perhaps people on
the list have tactics that work well for them?
Very broadly, I would expect that an orderly reset is best accomplished
by stopping producers, then consumers, then the broker; and the reverse
for starting up.
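To make the ordering concrete, here is a minimal sketch. The function orderly_reset and the stop()/start() methods are hypothetical; how you actually stop and start each piece depends on your deployment.

```python
def orderly_reset(producers, consumers, broker):
    """Stop producers, then consumers, then the broker; start in reverse.

    producers and consumers are lists of hypothetical objects exposing
    stop() and start(); broker is a single such object.
    """
    tiers = [producers, consumers, [broker]]
    # Shut down from the edge inwards, so nothing publishes to or
    # consumes from a broker that is going away.
    for tier in tiers:
        for component in tier:
            component.stop()
    # Bring things back in the opposite order: broker first, then
    # consumers, then producers.
    for tier in reversed(tiers):
        for component in tier:
            component.start()
```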
With regard to your application, if you are reconnecting, your
application must be prepared to be delivered messages that it's already
seen. The reason is that messages that weren't acknowledged will be
returned to the queue on channel close, and redelivered when there are
consumers again.
In the case of multiple consumers (e.g., if you are using a shared queue
to load-balance work items), this also means that messages may not be
distributed among consumers in the same way as before.
Usually this means making work items either transient (in which case,
you use no_ack=true when consuming) or idempotent (e.g., by having a
In the pub-sub case you probably either don't care if you're delivered
messages more than once anyway, or are happy to miss messages.
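As a rough illustration of the idempotent option above: the sketch below assumes each work item carries a unique ID set by the producer, and skips any ID it has already processed. The names IdempotentHandler, handle and message_id are made up for the example, and the in-memory set stands in for whatever durable store a real consumer would need.

```python
class IdempotentHandler:
    """Wrap a work function so that a redelivered message is applied once.

    Assumes every message carries a unique message_id chosen by the
    producer (an assumption of this sketch, not something AMQP enforces).
    """

    def __init__(self, process):
        self._process = process
        self._done = set()  # in memory only; lost if the consumer restarts

    def handle(self, message_id, body):
        """Return True if the work ran, False if it was a duplicate."""
        if message_id in self._done:
            return False
        self._process(body)
        # Record the ID only after the work succeeds, so a failure leaves
        # the message eligible for another attempt on redelivery.
        self._done.add(message_id)
        return True
```

A consumer would acknowledge the message whether handle() returns True or False, since in both cases the work has already been done.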
>> 3. In a system with hundreds of consumers on dozens of machines, it's
>> impractical to restart them by hand. Is there anything out of the box which
>> can facilitate reconnecting consumers to a broker?
Rabbit will try very hard to shut connections down in an orderly fashion
when you stop it. For that reason, you can generally rely on having a
known state in your application. Hence it is reasonable to expect to
simply respond to the AMQP connection closing by throwing away work in
progress, and trying to reconnect (with a backoff). But, as above, the
details will depend very much on your application.
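For what it's worth, the "reconnect with a backoff" part might look something like the sketch below. It is deliberately not tied to any client library: connect is a hypothetical zero-argument callable that returns a connection or raises ConnectionError (or another OSError), and the delay values are arbitrary defaults rather than recommendations.

```python
import random
import time


def reconnect_with_backoff(connect, base_delay=1.0, max_delay=60.0,
                           max_attempts=None, sleep=time.sleep):
    """Call connect() until it succeeds, waiting longer after each failure.

    connect is a hypothetical callable supplied by the application. If
    max_attempts is given, the last error is re-raised once that many
    attempts have failed.
    """
    attempt = 0
    while True:
        try:
            return connect()
        except OSError:  # ConnectionError is a subclass of OSError
            attempt += 1
            if max_attempts is not None and attempt >= max_attempts:
                raise
            # Exponential backoff capped at max_delay, with random jitter so
            # that hundreds of consumers do not all retry at the same moment.
            delay = min(max_delay, base_delay * 2 ** (attempt - 1))
            sleep(random.uniform(delay / 2, delay))
```

The application would call this from whatever hook tells it the connection has closed, after discarding its work in progress, and then set up its channels and consumers again on the new connection.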