[rabbitmq-discuss] A weird case of "PRECONDITION_FAILED - unknown delivery tag"
Razvan Dinu
drazvan at gmail.com
Wed Aug 28 13:30:32 BST 2013
Hello guys,
I've been using RabbitMQ for a while, but this time I have an error which I
did not manage to track down: "PRECONDITION_FAILED - unknown delivery tag
XXX" (where XXX varies).
Now, I've done my research and I know it's usually because of double
ack-ing, ack-ing on wrong channels or ack-ing messages that should not be
ack-ed. I've checked at least a dozen times, and it doesn't seem to be the
case for me.
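
For reference, the usual causes can be reproduced with a toy model of how delivery tags behave on a channel. This is only a sketch: `ToyChannel` and `try_ack` are made-up names, no pika or RabbitMQ code is involved, and it assumes only that tags are counted per channel and that acking an unknown tag closes the channel.

```python
class ToyChannel:
    """Toy model of a broker-side channel; not pika or RabbitMQ code."""

    def __init__(self):
        self.next_tag = 1
        self.unacked = set()
        self.open = True

    def deliver(self):
        """Hand out the next delivery tag and remember it as outstanding."""
        tag = self.next_tag
        self.next_tag += 1
        self.unacked.add(tag)
        return tag

    def ack(self, tag):
        """Acknowledge one delivery; an unknown tag closes the channel."""
        if not self.open:
            raise RuntimeError("channel is closed")
        if tag not in self.unacked:
            self.open = False
            raise RuntimeError(
                "PRECONDITION_FAILED - unknown delivery tag %d" % tag)
        self.unacked.remove(tag)


def try_ack(channel, tag):
    """Return None on success, or the error text on failure."""
    try:
        channel.ack(tag)
        return None
    except RuntimeError as exc:
        return str(exc)


a, b = ToyChannel(), ToyChannel()
tag = a.deliver()              # tag 1 on channel a
first = try_ack(a, tag)        # normal ack succeeds
double = try_ack(a, tag)       # double ack: tag is no longer outstanding
b.deliver()                    # tag 1 on channel b (tags are per channel)
wrong = try_ack(b, 5)          # tag that was never delivered on channel b
after_close = try_ack(a, 2)    # channel a was closed by the double ack
```

In this model the double ack and the wrong-channel ack both produce the "unknown delivery tag" error, and any later ack on the same channel fails because the channel is already closed.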
I've simplified the scenario to the minimum required to reproduce the error
(I'm running RabbitMQ 3.0.3 on Erlang R15B01; the code is written in Python
using pika 0.9.10p0).
I have one working agent which has two threads:
- thread one that fetches tasks continuously and puts them in a list
- a worker thread that continuously takes tasks from the list and performs
them
See code here: http://pastebin.com/PG8quVSw
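
The real code is in the pastebin above. The shape of the agent is roughly the following sketch, which uses no pika at all: `fetch_tasks`, `worker`, `tasks` and `done` are hypothetical names, and plain integers stand in for the tasks that would come from RabbitMQ.

```python
import threading
import time

tasks = []                 # shared in-memory list of fetched tasks
lock = threading.Lock()    # guards the shared list
done = []                  # records the order in which tasks were performed
TOTAL = 20


def fetch_tasks():
    """Thread one: fetches tasks and appends them to the list."""
    for n in range(TOTAL):
        with lock:
            tasks.append(n)


def worker():
    """Worker thread: takes tasks from the list and performs them."""
    while len(done) < TOTAL:
        with lock:
            task = tasks.pop(0) if tasks else None
        if task is None:
            time.sleep(0.001)   # nothing fetched yet, try again shortly
            continue
        done.append(task)       # stands in for performing the task


threads = [threading.Thread(target=fetch_tasks),
           threading.Thread(target=worker)]
for t in threads:
    t.start()
for t in threads:
    t.join()
```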
Here's what happens (see example output here: http://pastebin.com/6f1sWsYa ):
the agent starts by initializing, pre-fetches 10 tasks (the prefetch_count
is set to 10) and then tasks are executed one by one. We can see that as
soon as one task is ACK-ed, a new one is fetched so that the prefetch queue
stays at 10 items. However, as we can see at line 52 in the example
output, the error message (406, 'PRECONDITION_FAILED - unknown delivery tag
12') is returned. The channel is closed, and when the next 10 tasks from
the prefetch queue try to send the ACK they get the error that the channel
is closed. The tasks are prefetched again on the new channel that was
opened automatically and then everything continues fine till the end (there
were 60 tasks in total in the queue).
THE PROBLEM: why did ACK 12 fail?
I managed to reproduce this consistently, but always at a different
message, sometimes 42, 34 etc. One weird thing is that the error about the
unknown delivery tag is not returned on the call to basic_ack (line 13 in
the code), but on the code that consumes the queue (line 45). My guess is
that there's a race somewhere, but I can't figure out where. In pika maybe?
If I uncomment lines 34 and 35 from the code, it happens a lot less, but it
still happens 1-2 times per 1000 messages. I think it has something to do
with ACK-ing from one thread using the same channel that is used for
listening on another thread. But I see no other way of implementing this
scenario, with an internal pre-fetch queue on the consumer side. As for
why I need this: the worker has to be able to take a peek at some of the
tasks in the queues.
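
If the guess about sharing one channel between two threads is right (this is unconfirmed), one arrangement that keeps the internal pre-fetch queue is to let the worker hand finished delivery tags back to the consuming thread, so that only that thread ever touches the channel. The sketch below only illustrates the hand-off: `FakeChannel` is a hypothetical stand-in that records which thread sends each ack, and no real pika calls are made.

```python
import queue
import threading


class FakeChannel:
    """Hypothetical stand-in for a channel; records which thread acks."""

    def __init__(self):
        self.acked = []
        self.ack_threads = set()

    def basic_ack(self, delivery_tag):
        self.acked.append(delivery_tag)
        self.ack_threads.add(threading.current_thread().name)


channel = FakeChannel()
prefetched = queue.Queue()   # internal pre-fetch queue: consumer -> worker
to_ack = queue.Queue()       # finished delivery tags: worker -> consumer
TOTAL = 10


def worker():
    """Performs tasks but never touches the channel."""
    for _ in range(TOTAL):
        tag = prefetched.get()
        to_ack.put(tag)      # ask the consumer thread to send the ack


def consumer():
    """Owns the channel: receives deliveries and sends every ack itself."""
    for tag in range(1, TOTAL + 1):
        prefetched.put(tag)  # stands in for a delivery arriving
    for _ in range(TOTAL):
        channel.basic_ack(delivery_tag=to_ack.get())


w = threading.Thread(target=worker, name="worker")
c = threading.Thread(target=consumer, name="consumer")
w.start()
c.start()
w.join()
c.join()
```

With real pika code the consuming thread would have to drain `to_ack` in between processing deliveries; how best to do that with pika 0.9.10p0 is not something this sketch answers.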
Any ideas?
Thanks,
Raz