[rabbitmq-discuss] channel 1 badfun

Tim Watson tim at rabbitmq.com
Thu Aug 30 23:17:04 BST 2012


Hi John

On 25 Aug 2012, at 00:39, John Merrells wrote:

> 
> We have a two node cluster of RabbitMQ 2.8.6 on Erlang 5.8.5.
> 
> Sometimes one of the servers will peg the cpu and messages will
> stop passing through the system. We don't have a reliable way of
> reproducing this...
> 

There are a few conditions that can cause this to happen. What other symptoms do you see when this occurs? Does memory usage go up as well? Does the CPU usage come down eventually, or does it stay pegged?

> But, we do have some curious error reports in the logs. I've read
> through rabbit_amqqueue.erl and delegate.erl and I think that
> one server is telling the other to execute some function in 
> rabbit_amqqueue... but I don't see why that function would be 
> 'bad'.
> 
> Anyone any thoughts on the cpu issue, or this error report?
> 

Well, we'd certainly like to get to the bottom of it asap. Can you please tell us a bit more about your setup:

   a. cluster setup (i.e., disk vs ram nodes)
   b. how many queues, are there any HA queues?
   c. how you're interacting with the broker/cluster (publishers, consumers, exchanges, bindings and so on)

Also, if there are any more of the logs (including the sasl logs) from both nodes that you're able to share, that would help too. If this occurs again and you're able to run `rabbitmqctl report` on the affected node, that would be great.

Cheers,
Tim

