[rabbitmq-discuss] Unexplained shutdown of RabbitMQ

Jason Zaugg jzaugg at gmail.com
Fri May 14 14:20:14 BST 2010


On Fri, May 14, 2010 at 2:04 PM, Matthew Sackman <matthew at lshift.net> wrote:
>> 1. Can we configure RabbitMQ to suppress logging of the message queue
>> when this error occurs?

> Unfortunately not. We would like to fix that ourselves too, and there's
> meant to be a hook to control that, but there have been reports of bugs
> in Erlang itself regarding that particular hook.
> It might be something we will be able to fix eventually.

No matter. Grep will do for now.

>> 2. What might "writer,send_failed,badarg" as the termination reason
>> suggest as the root cause?
>> 3. Prior to the shutdown, what is the meaning of:
>>
>>   exception on TCP connection <0.2021.0> from 10.30.33.169:3251
>>   {timeout,running}
>>   exception on TCP connection <0.1707.0> from 10.30.32.44:2692
>>   connection_closed_abruptly
>
> We think that you have heartbeats turned on. Heartbeats are quite
> unreliable under Windows, especially when the machine is loaded, because
> of scheduler issues. Try turning heartbeats off (has to be done from
> each client). In later releases of Rabbit (and its clients), we disabled
> heartbeats by default. 1.6.0 is quite old - we'd recommend you upgrade
> to 1.7.2 if you can.

I looked at the apps a bit further, and they are using 1.5.4 of the
Java client, and the heartbeat is left at the default of 3 seconds.
Unfortunately we need to rebuild the app to change the heartbeat, as
we didn't expose it as a property.
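
For reference, a minimal sketch of how the heartbeat could be exposed as a
property when we do rebuild. The property name "amqp.heartbeat.seconds" and
the HeartbeatSetting class are hypothetical names of my own, not part of the
RabbitMQ client. The sketch assumes a requested heartbeat of 0 means "no
heartbeats", and that the resolved value is then passed to the client's
connection settings (I believe that is setRequestedHeartbeat on
ConnectionParameters in the 1.x Java client, but check it against the
version in use).

```java
import java.util.Properties;

// Hypothetical helper: resolves the heartbeat interval from a property
// instead of hard-coding it in the application.
final class HeartbeatSetting {
    // Hypothetical property name, chosen for this sketch only.
    static final String PROPERTY = "amqp.heartbeat.seconds";

    // Assumption: a requested heartbeat of 0 disables heartbeats.
    static final int DISABLED = 0;

    // Returns the configured interval in seconds, or DISABLED when the
    // property is missing, blank, negative, or not a number.
    static int resolve(Properties props) {
        String raw = props.getProperty(PROPERTY);
        if (raw == null || raw.trim().isEmpty()) {
            return DISABLED;
        }
        try {
            int seconds = Integer.parseInt(raw.trim());
            return seconds < 0 ? DISABLED : seconds;
        } catch (NumberFormatException e) {
            return DISABLED;
        }
    }

    public static void main(String[] args) {
        // Reads -Damqp.heartbeat.seconds=N from the command line, if given.
        int heartbeat = resolve(System.getProperties());
        System.out.println("requested heartbeat: " + heartbeat + "s");
        // The value would then be handed to the client's connection
        // settings before the connection is opened (not shown here).
    }
}
```

Defaulting to "disabled" follows Matthew's advice above; passing
-Damqp.heartbeat.seconds=3 would restore the old behaviour without a rebuild.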

The trunk versions of the apps have been updated to 1.7.0, and we plan
to upgrade the broker soon.

The system is running smoothly again, so in the short term we'll:
- monitor for the next few days
- patch our apps to disable heartbeat if we hit problems again
- upgrade the clients and broker if it is still unstable.

Thanks for the help,

Jason


