[rabbitmq-discuss] precondition_failed error with amqp_client for erlang

Max Warnock maxjwarnock at gmail.com
Thu Jun 30 14:23:29 BST 2011


Thanks, that's very helpful, both as a list of possible issues to chase
and as a sanity check.

I'm using Erlang R13B04 with a RabbitMQ server, version 2.4.1, installed
via Gentoo's portage. I pulled the client library from GitHub (tag
2.3.0, commit: 844738f9b56d34104c1ea2ac5700d0898126c5b4).

I'm going to write some debug code to store every delivery tag I try to ack
and see if I can get this error to the point where it's easily reproducible.
Thanks for narrowing my search. I'll keep you updated. I must be doing
something wrong somewhere; I have a hard time believing such a widely used
library could fail this badly.
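
A minimal sketch of what such debug code could look like, assuming a
hypothetical helper module named ack_tracker (it is not part of
amqp_client). It keeps the set of delivery tags that have already been
acked and reports a repeat instead of letting a second ack go out:

```erlang
%% ack_tracker: hypothetical debug helper, not part of amqp_client.
%% Records every delivery tag that has been acked so that a second ack
%% for the same tag can be detected and logged.
-module(ack_tracker).
-export([new/0, record/2]).

%% Start with an empty set of acked tags.
new() ->
    sets:new().

%% Returns {ok, NewSeen} the first time a tag is recorded and
%% {duplicate, Seen} if the tag has been recorded before.
record(Tag, Seen) ->
    case sets:is_element(Tag, Seen) of
        true  -> {duplicate, Seen};
        false -> {ok, sets:add_element(Tag, Seen)}
    end.
```

In the receive loop, record/2 would be called just before the
amqp_channel:cast for the basic.ack, logging the tag on {duplicate, _}
rather than sending the ack. Delivery tags are scoped to a channel, so one
set per channel is needed.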

One thing that would be extremely helpful is a pointer to some
documentation I haven't been able to find: a listing of all the
events/messages that the amqp client sends to a subscriber.  What does it
send when it goes down, what other soft errors will it send, etc.?
Additionally, is there a doc somewhere on best practices for connecting a
listener to another server/long-running process?  Without either of those,
it has been a struggle to know how to restart the subscription/listening
process if my server dies.  The amqp_client tutorial has been a great help,
but when it comes to error handling from the listening module's
perspective, it doesn't tell me what the library expects me to do.  I don't
want to do a bunch of engineering because I'm forcing a square peg into a
round hole with the library.

The primary issue I'm concerned with is this: when my server dies hard and
is about to be restarted by its supervisor, what should I send to the amqp
client process?  Should I send it close messages and then start a new one,
or should I reconnect to the client library?  This wouldn't be as big an
issue, but I need to use durable/persistent queues, and if a listener is
still hanging around with the same bindings on the same queue, it will eat
all my messages and send them nowhere.
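
One possible shape for the cleanup side of this, offered only as a sketch
and not as documented library guidance: a small watcher process monitors
the listener and runs a cleanup fun when the listener goes down. The module
name listener_watch and the Cleanup callback are hypothetical. In a real
system the callback might close the channel the dead listener was consuming
on (for example with amqp_channel:close/1, assuming that call behaves the
same way in the client version in use), since closing a channel ends its
consumers.

```erlang
%% listener_watch: hypothetical helper, not part of amqp_client.
%% Monitors a listener process and runs a cleanup fun once it dies, so
%% that nothing keeps consuming on behalf of a dead listener.
-module(listener_watch).
-export([start/2]).

%% Listener is the pid to watch. Cleanup is a fun of one argument (the
%% exit reason). Returns the watcher pid once the monitor is in place.
start(Listener, Cleanup) when is_pid(Listener), is_function(Cleanup, 1) ->
    Parent = self(),
    Watcher = spawn(fun() ->
        Ref = erlang:monitor(process, Listener),
        %% Tell the caller the monitor is set up before it carries on.
        Parent ! {watching, self()},
        receive
            {'DOWN', Ref, process, Listener, Reason} ->
                Cleanup(Reason)
        end
    end),
    receive
        {watching, Watcher} -> Watcher
    end.
```

The watcher deliberately uses a monitor rather than a link, so a crash in
the listener does not take the watcher down before the cleanup has run.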

Thanks,
-Max

On Thu, Jun 30, 2011 at 7:48 AM, Matthew Sackman <matthew at rabbitmq.com> wrote:

> Hi Max,
>
> On Wed, Jun 29, 2011 at 06:28:59PM -0400, Max Warnock wrote:
> > I've built a behavior in erlang to subscribe to a given topic exchange
> > and farm out message handling.  I'm using the rabbitmq amqp_client
> > library for erlang and when I put the system under heavy load I get, on
> > occasion, the following error:
>
> Could you let us know which version of Rabbit, Erlang and the Erlang
> client you're using?
>
> > =ERROR REPORT==== 29-Jun-2011::18:02:18 ===
> > ** Generic server <0.1117.0> terminating
> > ** Last message in was {'$gen_cast',
> >                            {method,
> >                                {'channel.close',406,
> >                                    <<"PRECONDITION_FAILED - unknown delivery tag 856">>,
> >                                    60,80},
>
> That's a double-ack (probably). Sadly, the AMQP 0-9-1 spec says that
> acking is not idempotent, thus it's a fault to ack the same message
> multiple times...
>
> > The server receive loop where the ack happens looks like this:
> > receive
> > ...
> > {#'basic.deliver'{delivery_tag = Tag, routing_key = RoutingKey},
> > #amqp_msg{payload = Payload}} ->
> >     amqp_channel:cast(get(amqp_channel_pid),
> >                       #'basic.ack'{delivery_tag = Tag}),
> >     spawn_and_queue(spawn_handle_message, Module, RoutingKey, Payload),
> >     loop(Module);
> > ...
> > end
>
> ...hmmm, which is so simple that I can't see how it could go wrong: if
> you're not double acking then something else must be going on to make
> the broker think that it's not expecting an ack for that message, hence
> the error. If you're doing some sort of reject operation - either
> basic.nack or basic.reject on messages and you then subsequently ack one
> of those messages then that would also cause this error. There may be
> other cases as well.
>
> > The amqp_client_sup can't seem to bring back the client either and dies
> > from the retry intensity being reached.  I've done a hefty amount of
> > googling and can't seem to find where things could be going wrong.
> > Before jumping into the amqp_client code I thought I'd ask the mailing
> > list if they have any ideas.  The only thing I can think is that there
> > is a race condition within the client library.  I will be double
> > checking my code to be sure it isn't sending the ack twice, but given
> > the simplicity of the ack the only way it could is if it receives the
> > same message (with identical delivery tag) from the amqp_client library
> > twice.
>
> It could be a bug in the client library, but I'd be a little surprised
> if we're managing to duplicate messages somehow - that would be a new
> level of fail for us. ;) However, the fact that the entire connection
> dies is alarming and almost certainly a bug: PRECONDITION_FAILED is a
> soft error and should only tear down the channel, not the whole
> connection. After that, all you should have to do is create a new
> channel and everything else should be ok. If that's not the case please
> let us know.
>
> Best wishes,
>
> Matthew