[rabbitmq-discuss] Weird Crash - Recovery logic for durable messages/queues/exchanges?

Darien Kindlund darien at kindlund.com
Fri Aug 7 16:04:11 BST 2009


> Your consumers aren't by any chance containing some reconnecting logic that
> tries to connect to the server again whenever a connection has been dropped?
> I seem to recall reports that, for example, the ruby client does something
> like that. Please check the server log to be sure - it contains a record of
> all connects/disconnects.

Understood.  Yes, I've checked, and there were no
connection/re-connection attempts during the period when I saw that
the persistent messages in the durable queue were still marked as
un-ack'd.

> Once the server has delivered a message to a consumer, it will only become
> available again to other consumers when the recipient's channel/connection
> is closed.

Okay.
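
Just to check that I'm reading this correctly, here is a toy Python
model of the behaviour as I understand it.  It is only a sketch: the
ToyQueue class, its method names and the "buggy" channel id are all
made up for illustration, and none of it is RabbitMQ code or an AMQP
API.  A delivered message moves from 'ready' to 'un-ack'd', and it
only goes back to 'ready' when the channel that received it is closed.

```python
from collections import deque


class ToyQueue:
    """Toy in-memory model of ready/un-ack'd bookkeeping (hypothetical)."""

    def __init__(self):
        self.ready = deque()   # messages waiting for a consumer
        self.unacked = {}      # delivery tag -> (channel id, message)
        self._next_tag = 1

    def publish(self, message):
        self.ready.append(message)

    def deliver(self, channel_id):
        """Hand the oldest ready message to a channel; it becomes un-ack'd."""
        if not self.ready:
            return None
        message = self.ready.popleft()
        tag = self._next_tag
        self._next_tag += 1
        self.unacked[tag] = (channel_id, message)
        return tag, message

    def ack(self, tag):
        """An acknowledged message is gone for good."""
        self.unacked.pop(tag, None)

    def close_channel(self, channel_id):
        """Re-mark everything the channel never ack'd as ready again."""
        tags = sorted(t for t, (ch, _) in self.unacked.items()
                      if ch == channel_id)
        # Push back in reverse so the original order is preserved.
        for tag in reversed(tags):
            _, message = self.unacked.pop(tag)
            self.ready.appendleft(message)
        return len(tags)


q = ToyQueue()
for m in ("m1", "m2", "m3"):
    q.publish(m)

# A consumer that takes messages but never acks them.
q.deliver("buggy")
q.deliver("buggy")
stuck = len(q.unacked)                # 2 messages sit un-ack'd

requeued = q.close_channel("buggy")   # only now do they become ready again
```

In this model, nothing short of close_channel() moves a message out
of 'un-ack'd' without an ack, which is the situation I seem to be
stuck in.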

>> I'm trying to avoid having to shutdown the RabbitMQ server and
>> obliterate the mnesia persister log in order to clear out these
>> messages.
>
> That would remove all messages completely, rather than just making the
> unacknowledged messages available to other consumers again. If the former is
> really what you want then the 'queue.purge' command is your friend.

Okay, so 'queue.purge' will flush all 'ready' and 'un-ack'd' messages
from a particular queue -- or just un-ack'd messages?  Is there a
command in the AMQP spec that will instruct RabbitMQ to re-mark all
un-ack'd messages as ready?  If no such command exists, I'm thinking
it would be useful to include such a command in future versions of the
spec, so that people could develop 'message recovery logic', when
dealing with buggy consumers that are connected but are not actually
properly processing the messages.  I'm guessing your reply would be to
simply terminate the buggy consumers forcefully by hand (I'm assuming
there's an AMQP spec command to do this), which would cause RabbitMQ
to re-mark those un-ack'd messages as 'ready'.  However, providing an
explicit command may be helpful for debugging purposes, as I seem to
have encountered a bug where RabbitMQ didn't perform the re-mark
operation as expected.

If I run into this issue in the future, would it help if I
provided you with a copy of the mnesia directory after RabbitMQ has
unexpectedly crashed?

Thanks again,
-- Darien



