[rabbitmq-discuss] server crashes with very fast consumers
john at detreville.org
Fri Apr 8 21:37:08 BST 2011
I ran your tests on a Linux VM with much larger ulimits. I saw some problems, but not the crash you described.
Your scripts start up 1,000 Unix processes to declare 1,000 queues, then 1,000 processes to publish messages (each to "its" queue), then 1,000 processes to consume messages (each from "its" queue). Each publishing process publishes 3,000 messages; each consuming process consumes 3,000 messages.
When I run your test, many of the publishers and consumers die because they can't contact the RabbitMQ broker, or can't connect to it in time. This is not surprising, since a small VM with over 1,000 active processes will run very slowly. The failing processes die with messages like "Cannot connect to localhost:5672" and "Opening socket: Connection timed out" and so on.
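One way to make the test clients more tolerant of this kind of connection storm is to retry the initial TCP connect with an increasing, randomized delay. The sketch below is only an illustration in Python using the standard library, not part of the test scripts under discussion; the function name `connect_with_backoff` and its parameters (`attempts`, `base_delay`, `timeout`) are hypothetical, and the host and port are taken from the error message quoted above.

```python
import random
import socket
import time


def connect_with_backoff(host="localhost", port=5672, attempts=5,
                         base_delay=0.5, timeout=5.0,
                         connect=socket.create_connection, sleep=time.sleep):
    """Open a TCP connection, waiting longer after each failed attempt.

    `connect` and `sleep` are injectable so the retry logic can be
    exercised without a running broker (an assumption of this sketch).
    """
    last_error = None
    for attempt in range(attempts):
        try:
            # socket.create_connection(address, timeout) returns a connected socket.
            return connect((host, port), timeout)
        except OSError as exc:  # covers refused connections and timeouts
            last_error = exc
            if attempt < attempts - 1:
                # Exponential backoff plus random jitter, so that many
                # clients started together do not all retry at once.
                delay = base_delay * (2 ** attempt)
                sleep(delay + random.uniform(0, delay))
    raise ConnectionError(f"Cannot connect to {host}:{port}") from last_error
```

This only covers the TCP connect; the AMQP handshake that follows can also time out under load and would need its own handling in a real client.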
Although these processes were temporarily unable to connect to the RabbitMQ broker, the broker itself seems to behave properly; it processes messages from and to the subset of publishers and consumers that were able to get through the initial connection storm. It had nothing unusual in its log.
Could you confirm that this is NOT the failure mode you saw? And how did you know that the RabbitMQ broker had crashed? Did the process go away? Or was it impossible to contact it later on? And how did you try to contact it?
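As a rough way to tell "the broker process is gone" apart from "the broker is up but slow to accept clients", one could probe the listener port directly. This is a minimal sketch in Python, assuming the broker listens on localhost:5672 as in the error messages above; the function name `broker_is_listening` is hypothetical. A successful connect only shows that something is accepting connections on that port, not that the broker is healthy.

```python
import socket


def broker_is_listening(host="localhost", port=5672, timeout=3.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # Open and immediately close the connection; no AMQP traffic is sent.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:  # refused, unreachable, or timed out
        return False


if __name__ == "__main__":
    print("listening" if broker_is_listening() else "not reachable")
```

If the probe fails, checking whether the beam process still exists, or running `rabbitmqctl status`, would help separate a dead broker from an unresponsive one.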
On Mar 29, 2011, at 6:54 PM, John DeTreville wrote:
> My laptop has a lower ulimit hardwired in. I'll build a VM running another OS.
> On Mar 29, 2011, at 12:10 AM, alex chen wrote:
>> One more question. You say RabbitMQ crashes when you run these tests? And it crashes without writing anything interesting in the logs? Or printing anything to the console? It just exits?
> It crashed without any errors in the log.
> I found a problem with the amqp_consumer.c that I sent last week: it did not send an acknowledgement after consuming each message. Attached please find the updated amqp_consumer with ack enabled. Please use this for testing.
> Currently the tests send 3000 messages to each queue (MESSAGE_COUNT=3000 in common.sh).
> If you find that some fast consumers finish consuming their 3000 messages much earlier than other consumers, please change to MESSAGE_COUNT=5000. This reliably reproduces the broker crash. If you run "top | grep beam" while all 1000 consumers are running, you can see the broker's memory usage grow to more than 3500 MB before it crashes.