[rabbitmq-discuss] RabbitMQ hangs, does not accept connections
matthias at rabbitmq.com
Fri Dec 30 12:11:30 GMT 2011
On 30/12/11 08:27, Dmitri Minaev wrote:
> I would be most grateful if you could have a look at our server.
I have done this now.
Rabbit indeed wasn't accepting connections - it wasn't refusing them
either, i.e. it was behaving as if 'accept' hadn't been called.
The Erlang process tasked with accepting AMQP connections was alive and
well. It was simply sitting there waiting for the tcp subsystem to
notify it of new connections. Alas that never happened.
So either the acceptor process forgot to tell Erlang's tcp stack to be
notified of new connections, or Erlang's tcp stack forgot that it was
supposed to tell the acceptor process...
...and it looks like there is a code path in tcp_acceptor.erl that would
trigger the former. When the tcp stack notifies the acceptor of an error
other than 'closed', the acceptor carries on but does not invoke
prim_inet:async_accept/2 again, so it is never notified of the next
connection attempt.
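
To illustrate the failure mode, here is a minimal toy model in Python. It is not RabbitMQ's code and it does not use Erlang's API: ToyTcpStack, Acceptor, run, the rearm_on_error flag and the 'emfile' error reason are all hypothetical names chosen for the sketch. The only assumption carried over from the description above is that the notification is one-shot, so the acceptor has to ask again after every event it receives. Re-arming on error is shown as one possible remedy, not necessarily the fix that will be applied.

```python
from collections import deque


class ToyTcpStack:
    """Toy stand-in for a TCP stack with one-shot accept notifications."""

    def __init__(self):
        self.pending = deque()  # events nobody has been notified about yet
        self.armed = None       # callback waiting for the next event, if any

    def async_accept(self, callback):
        # Ask to be notified of exactly one future event.
        self.armed = callback
        self._deliver()

    def incoming(self, event):
        # A new connection attempt or error arrives from the network.
        self.pending.append(event)
        self._deliver()

    def _deliver(self):
        # Each delivery consumes the outstanding request (one-shot).
        while self.armed is not None and self.pending:
            callback, self.armed = self.armed, None
            callback(self.pending.popleft())


class Acceptor:
    """Toy acceptor; rearm_on_error=False models the suspected bug."""

    def __init__(self, stack, rearm_on_error):
        self.stack = stack
        self.rearm_on_error = rearm_on_error
        self.accepted = []
        self.stopped = False
        stack.async_accept(self.on_event)

    def on_event(self, event):
        kind, payload = event
        if kind == "ok":
            self.accepted.append(payload)
            self.stack.async_accept(self.on_event)  # wait for the next one
        elif payload == "closed":
            self.stopped = True                     # listener gone: stop
        elif self.rearm_on_error:
            self.stack.async_accept(self.on_event)  # possible remedy
        # else: carry on without re-arming, so no further notifications


def run(rearm_on_error):
    stack = ToyTcpStack()
    acceptor = Acceptor(stack, rearm_on_error)
    # 'emfile' is an arbitrary example of an error other than 'closed'.
    for event in [("ok", "conn1"), ("error", "emfile"), ("ok", "conn2")]:
        stack.incoming(event)
    return acceptor, stack


if __name__ == "__main__":
    buggy, buggy_stack = run(rearm_on_error=False)
    print(buggy.accepted, len(buggy_stack.pending))  # conn2 is left waiting

    # Manually issuing another accept request un-sticks the acceptor.
    buggy_stack.async_accept(buggy.on_event)
    print(buggy.accepted)

    fixed, _ = run(rearm_on_error=True)
    print(fixed.accepted)
```

In the model, the acceptor without re-arming stays alive but never hears about conn2, which matches the symptom of connections being neither accepted nor refused. Issuing one more accept request by hand gets it going again until the next such error.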
I will file a bug for this. It should be easy to fix, though we cannot be
certain that this is what caused the problem on your server.
Obviously if this was happening frequently we would have heard about the
issue a long time ago - the code in question hasn't changed for >3
years. So there must be some rare circumstances triggering this.
I got the acceptor process to issue another async_accept, so rabbit is
happy for the moment. But no doubt the problem will re-occur.