[rabbitmq-discuss] RabbitMQ internal error
Alexis Richardson
alexis.richardson at cohesiveft.com
Tue Mar 4 11:42:15 GMT 2008
Michael
Some of your client code might be useful as part of a management
tool. Could it be separated from your proprietary code and shared
in some form?
alexis
On Tue, Mar 4, 2008 at 11:01 AM, Michael Arnoldus <chime at mu.dk> wrote:
> Found the problem.
>
> Among other things we use AMQP/RabbitMQ as a transport for RPC style
> calls. A fast way to implement this was to create a new anonymous
> queue for the expected reply and then send the queue name in the
> 'reply to' field. We did and it worked, however we forgot two things:
> 1. destroy the queue after it was used, and 2. set the queue to
> auto-delete so that if the module actually crashes, the queue gets
> deleted anyway. We have a watchdog function that pings all our AMQP
> modules, and in case of no reply (over some time) it will kill the
> module and make it restart.
>
> So when we ran out of queues, RabbitMQ simply stopped responding,
> causing the watchdog to kill the AMQP modules, which would restart
> and try again, and so on.
>
> The result was a heap of clients and a heap of queues.
>
> The fix had three parts: set RPC reply queues to auto-delete,
> destroy them actively after use or timeout, and modify the watchdog
> so it won't kill anything unless it can actually ping itself through
> AMQP.
>
> Now everything works with a steady queue count.
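
To make the leak and the fix concrete, here is a minimal sketch in
Python. It is a toy in-memory model, not the RabbitMQ or AMQP client
API: ToyBroker, rpc_call and the "cleanup" flag are hypothetical names
made up for illustration. It also simplifies auto-delete to "removed
when the owning connection goes away", whereas a real broker removes
an auto-delete queue once its last consumer is gone.

```python
import itertools


class ToyBroker:
    """In-memory stand-in for a broker (hypothetical, not a real client API)."""

    def __init__(self):
        self.queues = {}  # queue name -> (owning connection id, auto_delete flag)
        self._ids = itertools.count(1)

    def declare_anonymous_queue(self, conn_id, auto_delete):
        name = f"reply.{next(self._ids)}"
        self.queues[name] = (conn_id, auto_delete)
        return name

    def delete_queue(self, name):
        self.queues.pop(name, None)

    def close_connection(self, conn_id):
        # Simplification: auto-delete queues vanish with their connection.
        for name, (owner, auto_delete) in list(self.queues.items()):
            if owner == conn_id and auto_delete:
                del self.queues[name]


def rpc_call(broker, conn_id, cleanup):
    """One RPC round trip using a fresh anonymous reply queue."""
    reply_queue = broker.declare_anonymous_queue(conn_id, auto_delete=cleanup)
    try:
        # A real client would publish the request with reply_to=reply_queue
        # here and wait for the answer, with a timeout.
        return "reply"
    finally:
        if cleanup:
            # Delete on success, error or timeout alike.
            broker.delete_queue(reply_queue)


leaky, fixed = ToyBroker(), ToyBroker()
for _ in range(100):
    rpc_call(leaky, "conn-1", cleanup=False)
    rpc_call(fixed, "conn-1", cleanup=True)
print(len(leaky.queues), len(fixed.queues))
```

Without cleanup, the queue count grows by one per call; with it, the
count stays flat, and the auto-delete flag covers the case where the
module dies before it reaches the explicit delete.
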
>
> Thanks to Tony for all the help in finding this bug. Your support is
> awesome!!!
>
> Regards,
>
> Michael Arnoldus
>
>
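
The watchdog change described above boils down to one rule: a silent
module is only evidence of a broken module if the broker path itself
is known to work. A rough sketch of that decision, assuming the
watchdog has already collected two booleans (should_restart and both
parameter names are hypothetical, not taken from Michael's code):

```python
def should_restart(module_replied, self_ping_ok):
    """Decide whether the watchdog should kill and restart a module.

    module_replied: the module answered the watchdog's ping in time.
    self_ping_ok:   the watchdog managed to ping itself through AMQP.
    """
    if not self_ping_ok:
        # The broker path is not proven healthy, so silence from the
        # module says nothing about the module itself: leave it alone.
        return False
    return not module_replied
```

With this rule, a broker that has stopped responding no longer
triggers the kill-and-restart loop that piled up clients and queues.
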
> On Feb 28, 2008, at 8:52, Tony Garnock-Jones wrote:
>
>
>
> > Hi Michael,
> >
> > Michael Arnoldus wrote:
> >> Yesterday we experienced another problem with RabbitMQ. Possibly
> >> still our own fault, but this time a bit more severe. Out of the
> >> blue it became impossible to send a single message through Rabbit.
> >> Even restarting the components that connect to Rabbit didn't help.
> >> The Erlang process stayed up but didn't seem to do any work.
> >> Killing the beam process helped and everything returned to normal.
> >
> > This is extremely interesting.
> >
> > - What architecture are you running on? Is it a Mac?
> > - Was the CPU pinned to 100%?
> > - Were you able to issue commands at the Erlang prompt in the server?
> >
> > We are tracking down what we suspect to be a Mac-specific bug in the
> > Erlang runtime that manifests in some corner-cases of socket
> > shutdown - it would be interesting if you have detected the same
> > thing we're chasing. (We are still in the early stages of our
> > investigation - we can't say for sure yet whether it's really a
> > runtime problem.)
> >
> >> In a log file we had:
> >> ERROR 2008-02-26 16:17:32,857 --call got Closed:
> >> Method(name=close, id=60) (541, 'INTERNAL_ERROR', 0, 0) content =
> >> None
> >
> > If only the other log files hadn't been stomped on by the broker
> > startup! Your message has prompted us to fix this bad behaviour - we
> > have changed the startup scripts to move existing log files out of
> > the way, keeping the most recent few files.
> >
> > The INTERNAL_ERROR message is very interesting, because it indicates
> > a real bug in the broker. We don't see it in the case of the Mac bug
> > I mentioned earlier, so you might have found something different.
> >
> > This is probably the code that ran:
> >
> > lookup_amqp_exception(Other) ->
> >     rabbit_log:warning("Non-AMQP exit reason '~p'~n", [Other]),
> >     {true, ?INTERNAL_ERROR, <<"INTERNAL_ERROR">>, none}.
> >
> > ... which produces a "Non-AMQP exit reason" message in the log. I'm
> > afraid without that message, we'll have a tough time diagnosing this
> > one.
> >
> > Regards,
> > Tony
>
>
--
Alexis Richardson
+44 20 7617 7339 (UK)
+44 77 9865 2911 (cell)
+1 650 206 2517 (US)