[rabbitmq-discuss] Erlang crashes reports

Romary Kremer romary.kremer at gmail.com
Thu Sep 16 10:32:12 BST 2010


We are evaluating the RabbitMQ message broker, and we are currently
running into difficulties with release 2.0.0:

- Our application involves 10 000 peers producing messages periodically
to a single queue. This queue is consumed asynchronously by another peer.
- All peers are written in Java.
- The production rate of a single peer is 4 messages / hour.
- We can simulate a time-consuming task in the consumer callback,
to simulate a faster or slower consumer.
- We are using an SSL certificate on the broker side to allow the peers
to authenticate the broker.
	- We have noticed that the use of SSL has a dramatic impact on the
memory occupied by the RabbitMQ process.
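
To put the scenario above in numbers, here is a minimal sketch of the
aggregate publish rate, assuming every peer publishes at exactly the
nominal rate. The class and method names are illustrative only and are
not part of our application.

```java
// Back-of-the-envelope load estimate for the scenario described above.
// LoadEstimate and messagesPerSecond are hypothetical names.
final class LoadEstimate {

    /** Aggregate publish rate, in messages per second, across all peers. */
    static double messagesPerSecond(int peers, double messagesPerPeerPerHour) {
        return peers * messagesPerPeerPerHour / 3600.0;
    }

    public static void main(String[] args) {
        int peers = 10_000;          // producing peers
        double perPeerPerHour = 4.0; // production rate of a single peer

        double perHour = peers * perPeerPerHour;
        double perSecond = messagesPerSecond(peers, perPeerPerHour);
        System.out.println("aggregate rate: " + perHour + " msg/hour, "
                + perSecond + " msg/s");
    }
}
```

That works out to 40 000 messages per hour, or roughly 11 messages per
second, so the message throughput itself is modest compared with the
number of simultaneous connections.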

Since we upgraded to version 2.0.0, we are no longer able to get a
test scenario to run. The symptoms are listed below:

First, on the broker console, we get the message:

Erlang has closed
Error: unable to connect to node rabbit at murphys: nodedown
diagnostics:
- nodes and their ports on murphys: [{rabbitmqctl22609,42767}]
- current node: rabbitmqctl22609 at murphys
- current node home dir: /var/lib/rabbitmq
- current node cookie hash: qu0gh1hg7j7LKyzK0GLk+A==

We have kept the erl_crash.dump just in case, but since it is about
200 MB I have no way of sending it to you.
Maybe someone can give us some hints about which indicators to look for
in the dump to help with the diagnosis, as we are not fluent in Erlang!
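
One indicator that can be pulled out without sending or loading the
whole file is the "Slogan:" line that an Erlang crash dump carries near
its top, which states the reason the emulator stopped. Below is a minimal
sketch that reads only the first few lines of the dump. The class name,
the method name and the 50-line limit are assumptions made for
illustration.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;

// Hypothetical helper: extracts the crash reason from the dump header.
final class CrashDumpSlogan {

    private static final String PREFIX = "Slogan:";

    /**
     * Scans at most maxLines lines from the start of the dump and returns
     * the text after "Slogan:", if such a line is found.
     */
    static Optional<String> findSlogan(Path dump, int maxLines) throws IOException {
        // ISO-8859-1 maps every byte to a character, so stray bytes in the
        // dump cannot abort the read with a decoding error.
        try (BufferedReader in = Files.newBufferedReader(dump, StandardCharsets.ISO_8859_1)) {
            String line;
            int seen = 0;
            while (seen < maxLines && (line = in.readLine()) != null) {
                seen++;
                if (line.startsWith(PREFIX)) {
                    return Optional.of(line.substring(PREFIX.length()).trim());
                }
            }
        }
        return Optional.empty();
    }

    public static void main(String[] args) throws IOException {
        // Path is an assumption; pass the real location as the first argument.
        Path dump = Paths.get(args.length > 0 ? args[0] : "erl_crash.dump");
        System.out.println(findSlogan(dump, 50).orElse("no Slogan line found"));
    }
}
```

Because the reader stops after a handful of lines, the 200 MB size of
the dump does not matter here.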

What we know for sure is that the crash happens while the 10 000
connections are being established, at the beginning of the test.
We have monitored the number of connections established, and the
crash always happens at around 4500 - 5000 connections, but never at
exactly the same number.
We also tried with and without SSL, but this makes no difference (same
symptoms).

On the client side, our application registers a ShutdownListener to
implement connection retry logic upon shutdown.
After the crash, the retries always fail with the error: connection refused.
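
For reference, the retry logic has roughly the following shape. This is
a simplified sketch and not our actual code: the class name, the
exponential backoff and its parameters are assumptions, and the Callable
stands in for whatever call opens the broker connection from the
ShutdownListener callback.

```java
import java.net.ConnectException;
import java.util.concurrent.Callable;

// Hypothetical helper illustrating a bounded reconnect loop with backoff.
final class ReconnectWithBackoff {

    /**
     * Calls connect until it succeeds or maxAttempts is reached, doubling
     * the pause between attempts up to maxDelayMs. Rethrows the last
     * failure if every attempt fails.
     */
    static <T> T retry(Callable<T> connect, int maxAttempts,
                       long initialDelayMs, long maxDelayMs) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be >= 1");
        }
        long delay = initialDelayMs;
        Exception last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return connect.call();
            } catch (Exception e) {
                last = e; // e.g. "connection refused" while the node is down
                if (attempt == maxAttempts) {
                    break;
                }
                Thread.sleep(delay);
                delay = Math.min(delay * 2, maxDelayMs);
            }
        }
        throw last;
    }

    public static void main(String[] args) throws Exception {
        // Simulated connect call: refuses twice, then succeeds.
        final int[] calls = {0};
        String result = retry(() -> {
            if (++calls[0] < 3) {
                throw new ConnectException("Connection refused");
            }
            return "connected";
        }, 5, 100L, 1000L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

With 10 000 peers, capping the number of attempts and spacing them out
keeps the peers from all hammering the broker at the same moment. It
cannot help while the broker node itself is down, which matches the
connection refused errors we see.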

Here are some figures we gathered during test start-up for the
maximum number of connections established before the crash:

- with SSL : 5404, 4493, 4399

- without SSL : 4673
			
We don't think that the problem is related to file descriptors, since we
did not change anything in the configuration when we upgraded to 2.0.0.
The same test used to run successfully on previous versions of the
broker (1.7.2 and 1.8.1).
Moreover, the rabbit_status plugin tells us we have enough file
descriptors as well as Erlang processes:
	- file descriptors (used / available) = 34 / 65535
	- Erlang processes (used / available) = 160 / 1 000 000
	- memory (used / available) = 40 MB / 1609 MB

We haven't tried 2.1.0 yet, because we would appreciate your feedback
on this issue before we migrate to that release.

Best regards,

Romary.





