[rabbitmq-discuss] Weird Crash - Recovery logic for durable messages/queues/exchanges?

Darien Kindlund darien at kindlund.com
Fri Aug 7 04:12:49 BST 2009


So after running RabbitMQ v1.6.0 for a while, I've encountered a
strange crash, where the server unexpectedly dies with no crash report
or any applicable log information written to disk.  I'm trying to see
if I can replicate the issue, but in the meantime, when I recover the
server, it dutifully recovers all my messages, queues, exchanges, and
bindings (great!).  However, once the server recovers, all the durable
messages in the queues (as reported by rabbitmqctl) are still marked as
"messages_unacknowledged" -- not "messages_ready"... To my knowledge,
this means: "RabbitMQ thinks there is already an AMQP channel and
connection open which already has these messages -- and is simply
waiting for an ACK back from this AMQP consumer."  ... The problem is:
when RabbitMQ recovers, all AMQP channels/connections are terminated,
so this assumption is clearly wrong (in this scenario).
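
To make the expectation concrete, here is a toy Python model of the
behaviour I would expect. This is only a sketch and not RabbitMQ code:
the ToyQueue class and its method names are made up for illustration,
and it assumes that a restart leaves no channel alive, so anything
still awaiting an ack should be counted as ready again.

```python
from collections import deque


class ToyQueue:
    """Toy model of one durable queue; not RabbitMQ's actual implementation."""

    def __init__(self):
        self.ready = deque()  # what rabbitmqctl would count as messages_ready
        self.unacked = {}     # delivery tag -> message (messages_unacknowledged)
        self._next_tag = 1

    def publish(self, message):
        self.ready.append(message)

    def deliver(self):
        """Hand the oldest ready message to a consumer and wait for its ack."""
        message = self.ready.popleft()
        tag = self._next_tag
        self._next_tag += 1
        self.unacked[tag] = message
        return tag, message

    def ack(self, tag):
        del self.unacked[tag]

    def recover_after_restart(self):
        """No channel survives a restart, so unacked messages become ready again."""
        # Walk the tags newest-first so the oldest message ends up at the front.
        for tag in sorted(self.unacked, reverse=True):
            self.ready.appendleft(self.unacked.pop(tag))


queue = ToyQueue()
for body in ("a", "b", "c"):
    queue.publish(body)
queue.deliver()                # "a" is now unacknowledged
queue.recover_after_restart()  # simulated broker restart
print(len(queue.ready), len(queue.unacked))
```

In this model every message is back in the ready state after the
simulated restart, which is what I expected to see from rabbitmqctl.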

I'm wondering whether the "persister recovery" was designed so that,
when RabbitMQ restarts and sees that there was a crash, it has
built-in logic to essentially "reset" all durable messages in the
queues back to a "ready" state, so that a new AMQP consumer can
process them.

So, my questions are:

1) Does such "persister recovery" logic exist?
2) Is there any recommended _manual_ way (via the erl shell?) to
perform this "reset"? ... or do I simply have to obliterate my
persister log and regenerate all my queues, exchanges, and bindings?

Regards,
-- Darien



