[rabbitmq-discuss] Silent crash causes persistent durable message loss

Francesco Mazzoli francesco at rabbitmq.com
Mon May 21 21:50:31 BST 2012


Hi Will,

I have trouble believing that it is actually dying silently, with no 
information in the logs. Can you try to reproduce it and post the logs 
here?

In the meantime I'm going to do the obvious and suggest upgrading to 
2.8.2. We fixed several ugly bugs related to DLX (one of which was 
particularly easy to get) and they might be related to your problem.

Francesco.

On 21/05/12 20:22, Will Koffel wrote:
> How's that for a useless mysterious title? A bit more on what we're seeing:
>
> I'm running 2.8.1, and I have one queue in our setup with a long TTL
> ("expiring-queue", we'll call it), which then uses dead-letter-exchange
> to reroute to another queue ("action-queue"). The TTL for these expiring
> messages is 7 days. So it looks like this:
>
> [ec2-user@web03 current]$ sudo rabbitmqctl list_queues name messages arguments durable
> expiring-queue  311443  [{"x-dead-letter-exchange","my-exchange"},{"x-message-ttl",604800000},{"x-dead-letter-routing-key","action-queue"}]  true
> action-queue    0       []  true
>
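
For anyone trying to reproduce this setup, here is a rough sketch of how the two queues listed above might be declared. The queue names, exchange name, and argument values are taken from the listing; the helper names are made up for illustration, and the `channel` object is assumed to be a pika-style channel with a `queue_declare(queue=..., durable=..., arguments=...)` method. The original post does not say which client library is in use.

```python
# 7 days in milliseconds; matches the x-message-ttl of 604800000 in the listing.
SEVEN_DAYS_MS = 7 * 24 * 60 * 60 * 1000


def expiring_queue_arguments(ttl_ms=SEVEN_DAYS_MS,
                             exchange="my-exchange",
                             routing_key="action-queue"):
    """Build the queue arguments shown in the rabbitmqctl output (hypothetical helper)."""
    return {
        "x-message-ttl": ttl_ms,
        "x-dead-letter-exchange": exchange,
        "x-dead-letter-routing-key": routing_key,
    }


def declare_queues(channel):
    """Declare both durable queues on a pika-style channel (assumed interface)."""
    # Plain durable queue that the counter worker consumes from.
    channel.queue_declare(queue="action-queue", durable=True)
    # Durable queue whose messages expire after 7 days and are then
    # dead-lettered via "my-exchange" with routing key "action-queue".
    channel.queue_declare(queue="expiring-queue", durable=True,
                          arguments=expiring_queue_arguments())
```

Note that the exchange "my-exchange" and its binding to "action-queue" would have to exist already; that part is not shown here.
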
> (BTW, What I'm doing is using the TTL as a way to keep track of an event
> that expires after one week. Namely, we keep a count of a particular
> event for the last 7 days. So each time the event happens we write to
> the action-queue which increments the value, and to the expiring queue.
> 7 days later, that message gets expired into the action-queue again, and
> we decrement the counter. So we have a real-time, running 7 day counter.)
>
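
The counter scheme described above can be modelled without a broker, which may help when reasoning about what a lost queue does to the count. This is only an in-memory sketch: the class and method names are hypothetical, a deque stands in for "expiring-queue", and timestamps are passed in explicitly rather than read from a clock.

```python
from collections import deque

# Same value as the x-message-ttl on "expiring-queue": 7 days in milliseconds.
WINDOW_MS = 7 * 24 * 60 * 60 * 1000


class RollingCounter:
    """In-memory model of the increment-now, decrement-on-expiry scheme."""

    def __init__(self, window_ms=WINDOW_MS):
        self.window_ms = window_ms
        self.count = 0
        # Stands in for "expiring-queue": one publish timestamp per event.
        self.expiring = deque()

    def record_event(self, now_ms):
        # The event is counted immediately and a copy is parked until its TTL elapses.
        self._expire(now_ms)
        self.count += 1
        self.expiring.append(now_ms)

    def value(self, now_ms):
        self._expire(now_ms)
        return self.count

    def _expire(self, now_ms):
        # Entries older than the window are "dead-lettered": each one decrements the count.
        while self.expiring and now_ms - self.expiring[0] >= self.window_ms:
            self.expiring.popleft()
            self.count -= 1
```

In this model, clearing `expiring` while leaving `count` untouched reproduces the failure mode described in the post: the pending decrements are gone, so the counter never comes back down.
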
> This for the most part is stable. Except when it's not. We've seen 3 or
> 4 crashes of this system since we set it up 3 weeks ago. I can't find
> any information in the logs to tell me why Rabbit crashed, it just dies
> silently. But more disturbing is that when I bring it back up, all the
> messages (typically millions) in the expiring-queue are gone. That's
> death for me, because that's the only record of when those things are
> supposed to expire.
>
> Any leads on where to look for more crash reports or evidence of what's
> happening here? And importantly, in what case would these messages be
> lost (seems like that should never happen!)? The queue is durable, and
> I'm using deliveryMode=2 for the messages. I'm pretty sure that
> persistence works in general, because I can stop rabbit and restart it
> and all the messages are still there...they are only lost in the case of
> this odd silent crash. I've also tried doing evil things like kill -9
> various server processes, and haven't been able to reproduce the message
> loss in any controlled environment.
>
> I'm LOVING that this could work in Rabbit with the 2.8 updates, hoping
> to not have to move to another queueing system where I have to build all
> the dead-letter-routing stuff myself, when this is so close to a clean
> solution.
>
> Thanks in advance for any thoughts.
>
> -Will
>
>
>
>
> ________________
> Will Koffel
> CTO, Thumb™
> 51 E 12th St., 4th Floor
> New York, NY 10003
> Office: (212) 673-8650
> Mobile: (617) 575-WILL
> @thumb
> www.thumb.it <http://www.thumb.it/>