[rabbitmq-discuss] precondition_failed error with amqp_client for erlang
Alexandru Scvorţov
alexandru at rabbitmq.com
Mon Jul 4 17:16:43 BST 2011
Hi Max,
I'm trying to run through the steps you provided, but I'm having a bit
of trouble following.
Are you using a network or a direct connection? (I assume network, but
it probably doesn't matter)
By server, do you mean the actual RabbitMQ server, or your application?
(I'm guessing your long-running application)
By subscribe, do you mean calling amqp_channel:subscribe/3? If so, do
you still need a list of the messages the channel may send its
subscriber?
Or do you mean that your application is sending messages to its
subscribers?
> 6.) The server supervisor restarts the server which creates a new listener,
> but the old listener is still hanging around trying to send to the
> registered name
What's a listener? Is it a process that receives messages from the
erlang client because it's the endpoint of a subscription to a queue?
Can't you link listeners to the server so that when the server goes
down, it takes the listeners with it?
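To illustrate the linking idea, here is a minimal sketch in plain Erlang/OTP. It does not use amqp_client at all: the module name link_demo, the server/1 function and listener_loop/0 are hypothetical stand-ins for your application's server and for whatever process receives deliveries. It only shows that a listener started with spawn_link goes down when the server that started it is killed.

```erlang
-module(link_demo).
-export([run/0]).

%% Hypothetical listener: stands in for the process that receives
%% deliveries from a subscription. It just waits for messages forever.
listener_loop() ->
    receive
        _Msg -> listener_loop()
    end.

%% Hypothetical server: starts a listener linked to itself, tells the
%% caller the listener's pid, then idles until told to stop.
server(Parent) ->
    Listener = spawn_link(fun listener_loop/0),
    Parent ! {listener, Listener},
    receive
        stop -> ok
    end.

%% Kill the server and check whether the linked listener dies too.
run() ->
    Parent = self(),
    Server = spawn(fun() -> server(Parent) end),
    Listener = receive {listener, L} -> L end,
    Ref = erlang:monitor(process, Listener),
    exit(Server, kill),   % simulate the server dying hard
    receive
        {'DOWN', Ref, process, Listener, _Reason} -> listener_down
    after 1000 ->
        listener_still_alive
    end.
```

Calling link_demo:run() from the shell should report that the listener went down with the server. Note that the listener here does not trap exits; a process that sets trap_exit would receive the exit as a message instead and would have to stop itself.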
> So my question then is how should I kill the amqp_client?
What do you mean by amqp_client? If it's an amqp_connection process,
you can just send it a shutdown exit.
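As a sketch of what sending a shutdown exit looks like, the following uses a plain stand-in process rather than a real amqp_connection (the module name shutdown_demo is hypothetical). It assumes the target does not trap exits; how a real connection process reacts, and whether a supervisor then restarts it, is not shown here.

```erlang
-module(shutdown_demo).
-export([run/0]).

%% Send a 'shutdown' exit signal to a stand-in process and return the
%% reason it terminated with, as reported by a monitor.
run() ->
    %% Stand-in for the process to be shut down; it blocks forever.
    Pid = spawn(fun() -> receive never_sent -> ok end end),
    Ref = erlang:monitor(process, Pid),
    exit(Pid, shutdown),
    receive
        {'DOWN', Ref, process, Pid, Reason} -> Reason
    after 1000 ->
        timeout
    end.
```

With the real client, calling amqp_connection:close/1 on the connection pid is likely the cleaner option, since it closes the connection through the library instead of killing the process from outside.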
Thanks for the information.
Cheers,
Alex
On Fri, Jul 01, 2011 at 01:51:10PM -0400, Max Warnock wrote:
> Problem found. Thanks for your help. The problem is a strange one and has
> to do with me not shutting my amqp_client listener down properly if my
> server dies. Here is how it manifests:
>
> 1.) Server starts up and starts up an amqp client connection and channel
> 2.) The server binds to that channel and starts the subscription using a
> registered name as the process to which messages will be sent
> 3.) Messages start coming in and are ack-ing fine
> 4.) Poor error handling in farming out processes brings the server down
> 5.) The server does not close the amqp_client connection
> 6.) The server supervisor restarts the server which creates a new listener,
> but the old listener is still hanging around trying to send to the
> registered name
> 7.) The older listener sends a message to the server
> 8.) The server tries to ack to the new listener which did not send the
> message
> 9.) The new server pukes because it never sent a message with that tag
>
> So my question then is how should I kill the amqp_client? If I send it an
> exit its supervisor will restart it. This is what I was getting at with my
> tangential questions in the last email. How should I shut down the
> amqp_client without shutting down all the other servers' amqp client
> listeners?
>
> Thanks for all the help,
> -Max
>
> On Thu, Jun 30, 2011 at 9:23 AM, Max Warnock <maxjwarnock at gmail.com> wrote:
>
> > Thanks, that's very helpful from both the possible issues to chase and
> > sanity check perspectives.
> >
> > I'm using erlang R13B04 with a rabbitmq server installed via gentoo's
> > portage at version 2.4.1. I pulled the client library from github (tag
> > 2.3.0, commit: 844738f9b56d34104c1ea2ac5700d0898126c5b4).
> >
> > I'm going to write some debug code to store all the tags I try to ack on
> > and see if I can get this error to where it's easily reproducible. Thanks
> > for narrowing my search, it's very helpful. I'll keep you updated. I must
> > be doing something wrong somewhere. I have a hard time believing such a
> > widely used library could fail so hard myself.
> >
> > One thing that would be extremely helpful is if you could point me to some
> > documentation which I haven't been able to find: I'm looking for a listing
> > of all the events/messages that are sent out by the amqp client to a
> > subscriber. What does it send when it goes down, what other soft errors
> > will it send out, etc. Additionally, is there a doc somewhere for best
> > practices in connecting a listener to another server/long-running process?
> > Without either of those, it has been a struggle to know how to
> > restart the subscription/listening process if my server dies. The
> > amqp_client tutorial has been a great help, but when it comes to error
> > handling from the listening module perspective it doesn't tell me what the
> > library is expecting me to do. I don't want to have to do a bunch of
> > engineering because I'm square peg, round hole-ing the library. The primary
> > issues I'm concerned with are when my server dies hard and is destined to be
> > restarted by its supervisor what should I send to the amqp client process?
> > Should I send it close messages and then start a new one? Or should I
> > reconnect to the client library? This wouldn't be as big of an issue but I
> > need to use durable/persistent queues and if I still have a listener hanging
> > around with the same bindings on the same queue it will eat all my messages
> > and send them nowhere.
> >
> > Thanks,
> > -Max
> >
> > On Thu, Jun 30, 2011 at 7:48 AM, Matthew Sackman <matthew at rabbitmq.com> wrote:
> >
> >> Hi Max,
> >>
> >> On Wed, Jun 29, 2011 at 06:28:59PM -0400, Max Warnock wrote:
> >> > I've built a behavior in erlang to subscribe to a given topic exchange and
> >> > farm out message handling. I'm using the rabbitmq amqp_client library for
> >> > erlang and when I put the system under heavy load I get, on occasion, the
> >> > following error:
> >>
> >> Could you let us know which version of Rabbit, Erlang and the Erlang
> >> client you're using?
> >>
> >> > =ERROR REPORT==== 29-Jun-2011::18:02:18 ===
> >> > ** Generic server <0.1117.0> terminating
> >> > ** Last message in was {'$gen_cast',
> >> >                            {method,
> >> >                             {'channel.close',406,
> >> >                              <<"PRECONDITION_FAILED - unknown delivery tag 856">>,
> >> >                              60,80},
> >>
> >> That's a double-ack (probably). Sadly, the AMQP 0-9-1 spec says that
> >> acking is not idempotent, thus it's a fault to ack the same message
> >> multiple times...
> >>
> >> > The server receive loop where the ack happens looks like this:
> >> > receive
> >> >     ...
> >> >     {#'basic.deliver'{delivery_tag = Tag, routing_key = RoutingKey},
> >> >      #amqp_msg{payload = Payload}} ->
> >> >         amqp_channel:cast(get(amqp_channel_pid),
> >> >                           #'basic.ack'{delivery_tag = Tag}),
> >> >         spawn_and_queue(spawn_handle_message, Module, RoutingKey, Payload),
> >> >         loop(Module);
> >> >     ...
> >> > end
> >>
> >> ...hmmm, which is so simple that I can't see how it could go wrong: if
> >> you're not double acking then something else must be going on to make
> >> the broker think that it's not expecting an ack for that message, hence
> >> the error. If you're doing some sort of reject operation - either
> >> basic.nack or basic.reject on messages and you then subsequently ack one
> >> of those messages then that would also cause this error. There may be
> >> other cases as well.
> >>
> >> > The amqp_client_sup can't seem to bring back the client either and dies
> >> > from the retry intensity being reached. I've done a hefty amount of
> >> > googling and can't seem to find where things could be going wrong. Before
> >> > jumping into the amqp_client code I thought I'd ask the mailing list if they
> >> > have any ideas. The only thing I can think is that there is a race
> >> > condition within the client library. I will be double checking my code to
> >> > be sure it isn't sending the ack twice, but given the simplicity of the ack
> >> > the only way it could is if it receives the same message (with identical
> >> > delivery tag) from the amqp_client library twice.
> >>
> >> It could be a bug in the client library, but I'd be a little surprised
> >> if we're managing to duplicate messages somehow - that would be a new
> >> level of fail for us. ;) However, the fact that the entire connection
> >> dies is alarming and almost certainly a bug: PRECONDITION_FAILED is a
> >> soft error and should only tear down the channel, not the whole
> >> connection. After that, all you should have to do is create a new
> >> channel and everything else should be ok. If that's not the case please
> >> let us know.
> >>
> >> Best wishes,
> >>
> >> Matthew
> >> _______________________________________________
> >> rabbitmq-discuss mailing list
> >> rabbitmq-discuss at lists.rabbitmq.com
> >> https://lists.rabbitmq.com/cgi-bin/mailman/listinfo/rabbitmq-discuss
> >>
> >
> >